The underlying dataset for this report is the “Teen Relationship Survey Pretest”, obtained from the URL: http://www.pewinternet.org/datasets/sep-25-oct-9-2014-and-feb-10-march-6-2015-teens/
This survey is a KnowledgePanel research study conducted by the Pew Research Center. It focuses on the use of social media by both adults and their children aged 13 through 17.
Specifically for teens, the survey touches on a variety of topics related to friendships and romantic relationships, particularly the role that technology (such as mobile phones, Facebook, etc.) plays in those relationships. For parents, the survey asks about technology and the ways they and their children use a variety of digital devices and social media platforms.
The dataset has 1642 responses (the number of parents/kids who completed the survey) and 317 variables (the answers to each question).
The full code for cleaning the original dataset can be found in the R script. Briefly summarised, the following steps were executed to eliminate invalid entries:
Given the wide breadth of questions in the survey, it would be interesting to investigate three main aspects of how social media relates to (i) relationships between parents and their teenage children; (ii) relationships between teenagers and their friends; and (iii) teens’ self-perception. Each of these three areas has a set of questions and hypotheses, which are examined using the given dataset. These are covered in detail under section 3, with an executive summary of the questions, hypotheses and findings preceding each subsection.
The three categories are:
Findings on relationship between teenagers and their parents
Findings on relationship between teenagers and their close friends/significant others
Findings on self perception
The dataset has a number of questions that were posed to parents, and a number of similar questions posed to their children.
A number of hypotheses were investigated using this dataset, to see whether there are any relationships between the parents’ behaviours and the teens’ behaviours with respect to social media and technology usage.
For example, parents were asked whether they monitor their child’s location (which could be considered a form of stalking behaviour), and teens were asked whether they had monitored someone they were dating or had dated (a similar stalking behaviour). This was an opportunity to investigate whether teens were more likely to exhibit stalking behaviour if their parents did so first.
There were also questions on whether the parents restrict their child’s phone usage and whether the child sends sexy or flirty pictures or videos. One of the hypotheses was thus whether the child would be more likely to send such pictures or videos if the parent blocks the child’s phone usage. This could indicate that the parent’s intervention triggers rebellious behaviour (e.g. the stricter the parent, the more likely the child is to show rebellious behaviour).
Methodology
As the dependent variables for most of the investigations are categorical variables (being responses to survey questions), logistic regression was used for regression analysis instead of linear regression. Linear regression is poorly suited to a binary outcome: its fitted values are not constrained to lie between 0 and 1, so its coefficients would not translate into anything meaningful. Dependent variables where the possible responses were “No” or “Yes”, i.e. a binary 0 or 1 response, were chosen for analysis.
For logistic regression, the function glm was used: model <- glm(y ~ x1 + x2 + ... + xk, data = dataframe, family = "binomial").
In most cases, the independent variables were also survey responses that could have more than two options, e.g. 1.Yes, a lot, 2.Yes, a little, 3.No. Based on literature found online, there are two possible approaches: (1) leave such variables as numeric (R would interpret them as continuous variables), or (2) convert them to factors (R would interpret them as categorical variables, and the regression would yield a coefficient for each non-baseline option by creating dummy variables). Treating these variables as continuous might be contentious, as their values are discrete and one cannot necessarily say that the possible values are equally spaced. It might be difficult to explain why one option had value 1, and not 2 or 50. Hence, on balance, these variables were converted to factors, even though this increases the number of coefficients in the regression (which mechanically raises R2 in a linear model, and is penalised by the AIC in a logistic model). All dependent and independent variables were reordered so that a higher value represents a higher scale of response, i.e. instead of 1 for Yes and 2 for No, the coding became 0 for No and 1 for Yes.
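The two approaches can be compared in a minimal sketch. The data frame `toy`, its column names and the simulated responses are hypothetical and are not taken from the survey or the report's R script; the sketch only shows how glm expands a three-level factor into dummy coefficients.

```r
# Hypothetical data: a binary outcome and a three-option survey item
# coded 0 = "No", 1 = "Yes, a little", 2 = "Yes, a lot" (simulated values).
set.seed(1)
n <- 200
toy <- data.frame(
  y        = rbinom(n, size = 1, prob = 0.4),
  pressure = sample(0:2, n, replace = TRUE)
)

# Approach 1: keep the item numeric, so a single slope is estimated.
fit_numeric <- glm(y ~ pressure, data = toy, family = "binomial")

# Approach 2: convert to a factor, so each non-baseline level gets its
# own coefficient (level 0 is the reference category).
toy$pressure_f <- factor(toy$pressure, levels = 0:2)
fit_factor <- glm(y ~ pressure_f, data = toy, family = "binomial")

names(coef(fit_numeric))  # "(Intercept)" "pressure"
names(coef(fit_factor))   # "(Intercept)" "pressure_f1" "pressure_f2"
```

The factor version makes no assumption about the spacing between options, at the cost of one extra coefficient per additional level.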
It is also important to note that logistic regression coefficients are interpreted differently from linear regression coefficients. The model is linear in the log-odds of the dependent variable being true, so each coefficient is first exponentiated to give an odds ratio. If the exponentiated coefficient of a variable x is (1+k), then a one-unit increase in x (or x being 1 instead of 0 for a binary variable) multiplies the odds of the dependent variable being true by (1+k), i.e. the odds are (k*100)% higher (or lower if k is negative). The percentages quoted in the findings are changes in odds in this sense, not changes in probability.
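The conversion can be worked through for one coefficient reported later in this section. Strictly, the exponentiated coefficient is an odds ratio, so the percentage refers to odds rather than probability. The baseline probability `p0` below is a hypothetical value chosen only to illustrate that difference.

```r
# Coefficient on the log-odds scale, copied from the first regression
# output in this report (take_away_privileges1).
beta <- 0.98933

odds_ratio <- exp(beta)               # multiplicative change in the odds
pct_change <- (odds_ratio - 1) * 100  # percentage change in the odds

round(odds_ratio, 2)  # 2.69
round(pct_change)     # 169

# Hypothetical baseline probability (an assumption, not a survey figure),
# to show how a change in odds maps onto a change in probability.
p0    <- 0.05
odds1 <- p0 / (1 - p0) * odds_ratio
p1    <- odds1 / (1 + odds1)
p1
```

Because odds and probabilities diverge as the baseline rises, the same odds ratio implies a smaller relative change in probability for more common outcomes.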
The five questions that were looked into in this section, their corresponding hypotheses and associated findings are summarised in the section below.
Effects of Parent and child behaviour, on teen’s use of social media
The table below summarises the questions studied, the initial hypothesis, and findings from the data.
| Question | Hypothesis | Findings |
|---|---|---|
| 1. Relationship between parents blocking their child’s phone usage and their child’s online behaviour on sexy/flirty pictures | If teens are rebellious, then the stricter the parent, the more the child wants to be liberal and may be more inclined to send flirty pictures | The regression results suggest that if parents take away their child’s cell phone or internet privileges as punishment, the odds of the teen sending sexy or flirty pictures or videos of themselves are about 169% higher. There is no statistically significant effect of the other strict parental behaviours: using parental controls to restrict the child’s use of his/her cell phone, and limiting the amount of time or times of day when the child can go online. |
| 2. Do children self-censor their posts if they are friends with parents on social media | Teens whose parents are friends with them on social media are not likely to post details of their relationship online | There was no statistically significant evidence that the parent being connected to their child on social media had an impact on the child posting public affection for their significant other (a proxy for whether the child self-censors their posts). After more variables were added, the regression suggested that teens who feel pressure to only post content that makes them look good to others had around 90% (“Yes, a little”) to 214% (“Yes, a lot”) higher odds of posting public affection towards their significant other. There was no statistically significant evidence that the parental behaviours examined (connecting with the child on social media, being friends on Facebook, and checking the child’s social media profile) have an effect on the teen’s public affection behaviour. This seems to suggest that teens are more influenced by their self-perception than by their parents’ interventions when deciding whether to post public affection towards their significant other. |
| 3. Relationship between the parent using internet / social media themselves vs. them talking to their child about inappropriate online behavior | Parents are likely to be in a better position to advise their children on online behavior if they themselves use internet or are on social media | The results show that parents who use at least some form of internet themselves are more likely to talk to their children about inappropriate online behavior, perhaps because they better understand what goes on online. The parents_advice_score increases by 0.27 if the parent uses the internet, which is about 0.17 standard deviations of parents_advice_score. Similarly, parents are more likely to advise about inappropriate online behavior if their children are younger: the parents_advice_score decreases by 0.15 on average for every one-year increase in age. This makes sense, as younger kids are less aware when they are new to the internet. However, it can be argued that parents should pay just as much attention to older kids, as they are more likely to engage in inappropriate behavior in their late teens. |
| 4. Relationship between stalking behaviour of parent to child and stalking behaviour of child to his/her boyfriend/girlfriend | If parents stalk teen, teen is more likely to inherit the stalking behaviour and stalk his/her boyfriend/girlfriend | If the parent monitors their child’s location (the parent’s stalking behaviour), the teen’s odds of accessing the phone of someone they were dating or used to date (the child’s stalking behaviour) are around 190% higher (significant at the 5% level). Likewise, the teen’s odds of tracking the location of someone they were dating or used to date are around 567% higher (significant at the 5% level). In both cases, the parent monitoring the child’s location is positively associated with the child’s stalking behaviour (accessing a phone and tracking GPS location). |
| 5. How much more likely are children to trust their significant others, and not have the urge to constantly monitor their activities, if their parents are more trusting of them? | Children are likely to have healthier relationships and not have insecurities regarding their significant others if they are trusted by their parents | The results show that kids whose parents use monitoring tools to track their location are about 120% more likely, on average, to track their significant others’ activities on social media. Interestingly, males are about 45% less likely on average to keep track of their significant others on social media. A possible reason is that females tend, in general, to give more importance to their relationships than their male counterparts, and perhaps their happiness and well-being depend more on how things are going in their intimate relationships, making them more concerned to find out about their significant other. |
Question: Relationship between parents blocking their child’s phone usage and their child’s online behaviour on sexy/flirty pictures
Hypothesis: The stricter the parent, the more the child wants to be liberal.
Background info - data cleaning and manipulation
For this question, the relevant dataset questions are:
[Dependent variable]:
KDATE2_G: Have you ever done any of these things to let someone know you were attracted to them or interested in them? Have you sent them sexy or flirty pictures or videos of yourself?
1.Yes
2.No
[Independent variables]:
P14_F: Have you ever used parental controls to restrict your child’s use of his/her cell phone?
1.Yes
2.No
3.Does Not Apply
P13_D: Have you ever taken away your child’s cell phone or internet privileges as punishment?
1.Yes
2.No
3.Does Not Apply
P13_E: Have you ever limited the amount of time or times of day when your child can go online?
1.Yes
2.No
3.Does Not Apply
Data cleaning
Invalid responses were set to NA so that the affected rows are excluded from the analysis (e.g. if respondents were supposed to choose only options 1 or 2, but the data showed 3 or -1, these would be invalid responses).
The order of the options was reversed, so that a higher number represents the “most” choice, with 0 for No or None. E.g. if the question’s original choices were:
1.Yes, 2.No
The responses were re-coded to:
0.No, 1.Yes
For questions with the options Yes/No/Does Not Apply, responses of Does Not Apply were removed (by setting them to NA), and the remaining responses were coded as 0 for No and 1 for Yes.
## after data cleaning, number of valid data records for analysis is: 830
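The cleaning steps described above can be sketched as a small helper. The vector `raw` and the function `recode_yes_no` are hypothetical names introduced for illustration; the report's actual R script may implement the recoding differently.

```r
# Hypothetical raw responses: 1 = Yes, 2 = No, 3 = Does Not Apply,
# -1 = refused or otherwise invalid.
raw <- c(1, 2, 3, -1, 2, 1)

recode_yes_no <- function(x) {
  out <- rep(NA_integer_, length(x))  # anything unexpected stays NA
  out[x %in% 1] <- 1L                 # Yes -> 1
  out[x %in% 2] <- 0L                 # No  -> 0
  out
}

clean <- recode_yes_no(raw)
clean               # 1 0 NA NA 0 1
sum(!is.na(clean))  # 4 valid records remain

# Stored as a factor, glm() reports the coefficient with a "1" suffix.
clean_f <- factor(clean, levels = c(0, 1))
```

Starting from an all-NA vector means that any code outside the expected options is dropped by default rather than silently kept.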
Findings
The regression of whether the teen sends sexy or flirty pictures or videos on the strictness of the parents, measured by (i) using parental controls on the cell phone, (ii) taking away privileges and (iii) limiting time spent online, together with (iv) the teen’s age, is shown below.
##
## Call:
## glm(formula = y ~ parent_controls_cell_phone + take_away_privileges +
## limit_time_online + age, family = "binomial", data = na.omit(data_specific))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6945 -0.4843 -0.3711 -0.3008 2.5599
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.95810 1.59326 -5.622 1.88e-08 ***
## parent_controls_cell_phone1 -0.10125 0.32216 -0.314 0.75330
## take_away_privileges1 0.98933 0.32537 3.041 0.00236 **
## limit_time_online1 -0.16482 0.26913 -0.612 0.54026
## age 0.39233 0.09717 4.038 5.40e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 508.20 on 829 degrees of freedom
## Residual deviance: 480.41 on 825 degrees of freedom
## AIC: 490.41
##
## Number of Fisher Scoring iterations: 5
## (Intercept) parent_controls_cell_phone1
## 0.0001286901 0.9037054546
## take_away_privileges1 limit_time_online1
## 2.6894332059 0.8480453622
## age
## 1.4804232140
The take_away_privileges variable is statistically significant at the 1% level (p = 0.0024).
The regression results suggest that if parents take away their child’s cell phone or internet privileges as punishment, the odds of the teen sending sexy or flirty pictures or videos of themselves are about 169% higher (odds ratio 2.69). There is no statistically significant effect of the parent using parental controls to restrict the child’s use of his/her cell phone, or of limiting the amount of time or times of day when the child can go online.
Given that take_away_privileges was significant, the results suggest that parental strictness is associated with the teen’s behaviour on social media. However, to be more conclusive, more aspects of strict parental behaviour, and more forms of possible rebellious teen behaviour, could be examined if time permitted and more data were available. For example, there was a lack of data on other proxies for rebellious behaviour, or for the desire to be more liberal in response to a strict parent, besides sending flirty pictures. Perhaps the survey could have included questions on whether the child had found alternative ways to access social media, such as using a friend’s internet device, or telling parents that they were studying at school while actually using a library computer to access social media. With more data, the investigation could more comprehensively determine whether parents being strict made the situation worse than if they had not intervened.
Question: Do children self-censor their posts if they are friends with parents on social media.
Hypothesis: Teens whose parents are friends with them on social media are not likely to post details of their relationship online.
Background info - data cleaning and manipulation
For this question, the relevant dataset questions are:
[Dependent variable]:
KRSNS3_C: When you use social media do you ever tell your boyfriend, girlfriend or significant other how much you like them in a way that other people can see?
1.Yes
2.No
[Independent variables]:
P10: Are you connected with your child on any social media sites?
1.Yes
2.No
Similar data cleaning and manipulation operations were performed to remove invalid responses and to re-order values from least to most.
## after data cleaning, number of valid data records for analysis is: 313
Findings
##
## Pearson's product-moment correlation
##
## data: data_specific$parent_connect_child and data_specific$teen_public_affection
## t = 1.0954, df = 311, p-value = 0.2742
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.04920595 0.17167423
## sample estimates:
## cor
## 0.06199316
As the p-value is 0.2742, there is no statistically significant evidence to show that the two variables are correlated.
##
## Call:
## glm(formula = teen_public_affection ~ parent_connect_child +
## age, family = "binomial", data = na.omit(data_specific))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.138 -1.011 -1.010 1.353 1.354
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3955314 1.3493661 -0.293 0.769
## parent_connect_child1 0.3098837 0.2857402 1.084 0.278
## age -0.0006381 0.0862793 -0.007 0.994
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 424.89 on 312 degrees of freedom
## Residual deviance: 423.70 on 310 degrees of freedom
## AIC: 429.7
##
## Number of Fisher Scoring iterations: 4
## (Intercept) parent_connect_child1 age
## 0.6733221 1.3632666 0.9993621
The factor of parent connecting to child is not statistically significant.
Further examination of the data:
It seems that when the parent is connected to the child (option 1 / “Yes”), the proportion of teens who express their affection for their significant other publicly, is higher.
A proportion test is done to test this further:
prob_table <- table(data_specific$parent_connect_child, data_specific$teen_public_affection)
# swap the columns so that "Yes" (the second column) comes first, as prop.test treats the first column as successes
prob_table <- cbind(prob_table[, 2], prob_table[, 1])
prop.test(prob_table)
##
## 2-sample test for equality of proportions with continuity
## correction
##
## data: prob_table
## X-squared = 0.90961, df = 1, p-value = 0.3402
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.22359377 0.07121282
## sample estimates:
## prop 1 prop 2
## 0.4000000 0.4761905
The p-value is 0.3402, so there is no statistically significant evidence that the two proportions differ. Like the correlation test, this suggests that the parent’s social media connection with the child does not have a significant effect on whether the teen shows public affection for their significant other on social media.
Given these initial results, more variables that may relate to the hypothesis were added to the regression model:
KFSNS1_E: In general, does social media make you feel pressure to only post content that makes you look good to others? 1.Yes, a lot, 2.Yes, a little, 3.No
Included to test whether the child’s self-perception is more strongly correlated with his/her behaviour regarding posting public affection for a significant other.
P8: Are you friends with your child on Facebook? 1.Yes, 2.No
Included to see whether the parent being friends with the child on Facebook specifically (rather than on any social media platform, as asked by question P10) is a significant factor.
P13_C: Have you ever checked your child’s profile on a social networking site? 1.Yes, 2.No, 3.Does Not Apply
Included to see whether the parent checking the child’s profile is a significant factor. Perhaps if the child knows that his/her parent checks his/her profile, he/she will be more restrained in his/her posts.
## after data cleaning, number of valid data records for analysis is: 223
##
## Call:
## glm(formula = teen_public_affection ~ parent_connect_child +
## age + feel_pressure + friends_facebook + check_child_profile,
## family = "binomial", data = na.omit(data_specific))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.5361 -0.9536 -0.8478 1.1872 1.5624
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.03173 1.87908 0.549 0.58296
## parent_connect_child1 0.11567 0.36829 0.314 0.75346
## age -0.10619 0.11194 -0.949 0.34278
## feel_pressure1 0.63988 0.31683 2.020 0.04342 *
## feel_pressure2 1.14303 0.44113 2.591 0.00957 **
## friends_facebook1 0.07365 0.40218 0.183 0.85469
## check_child_profile1 -0.17095 0.37311 -0.458 0.64682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 302.98 on 222 degrees of freedom
## Residual deviance: 291.48 on 216 degrees of freedom
## AIC: 305.48
##
## Number of Fisher Scoring iterations: 4
## (Intercept) parent_connect_child1 age
## 2.8059157 1.1226306 0.8992518
## feel_pressure1 feel_pressure2 friends_facebook1
## 1.8962579 3.1362414 1.0764333
## check_child_profile1
## 0.8428603
The regression suggests that teens who feel pressure to only post content that makes them look good to others have around 90% (“Yes, a little”) to 214% (“Yes, a lot”) higher odds of posting public affection towards their significant other. There is no statistically significant evidence that the parental behaviours examined (connecting with the child on social media, being friends on Facebook, and checking the child’s social media profile) have an effect on the teen’s public affection behaviour.
This seems to suggest that teens are more influenced by their self-perception than by their parents’ interventions when deciding whether to show public affection towards their significant other on social media.
Question: Relationship between the parent using internet / social media themselves vs. them talking to their child about inappropriate online behavior.
Hypothesis: Parents are likely to be in a better position to advise their children on online behavior if they themselves use internet or are on social media
Background info - data cleaning and manipulation
For this question, the relevant dataset questions are:
[Dependent variable]:
parents_advice_score: This is a calculated column, equal to the sum of P15_B and P15_C. It measures how often a parent talks to his/her child about appropriate/inappropriate online behavior in general.
Takes values between 2 (lowest) and 8 (highest)
P15_B: How often do you talk with your child about what is appropriate or inappropriate to share online?
1.Never
2.Rarely
3.Occasionally
4.Frequently
P15_C: How often do you talk with your child about what is appropriate or inappropriate content for them to be viewing online?
1.Never
2.Rarely
3.Occasionally
4.Frequently
[Independent variables]:
P2_A (use_facebook): Do you ever use Facebook?
1.Yes
2.No
P2_B (use_twitter): Do you ever use Twitter?
1.Yes
2.No
P2_C (use_internet): Do you ever access the internet on a cell phone, tablet or other mobile handheld device, at least occasionally?
1.Yes
2.No
P2_D (use_other_social_media): Do you ever use some other social media site?
1.Yes
2.No
Data cleaning
Invalid responses were removed by filling them with NA (e.g. if respondents were supposed to choose only options 1 or 2, but the data showed 3 or -1, these would be invalid responses).
The numeric coding was changed so that 0 represents ‘No’ and 1 represents ‘Yes’. E.g. if the question’s original choices were:
1.Yes, 2.No
The responses were re-coded to:
0.No, 1. Yes
For questions with options Yes/No/Does not apply, ‘Does not apply’ was treated as an NA.
For the calculated column, ‘parents_advice_score’, if one of the two variables being summed was missing, the other was multiplied by 2 to get the parents_advice_score. If both of the variables had missing values in a row, ‘parents_advice_score’ was also treated as a missing value.
## After data cleaning, the number of valid data records for analysis is: 1070
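The rule for the calculated column can be sketched as follows. The vectors `p15_b` and `p15_c` hold hypothetical responses, and the nested ifelse is only one possible way to express the rule; it is not taken from the report's R script.

```r
# Hypothetical P15_B / P15_C responses on the 1 (Never) to 4 (Frequently) scale.
p15_b <- c(4, 3, NA, NA, 1)
p15_c <- c(4, NA, 2, NA, 2)

advice_score <- ifelse(
  is.na(p15_b) & is.na(p15_c), NA,        # both missing -> score is missing
  ifelse(is.na(p15_b), 2 * p15_c,         # only P15_B missing -> double P15_C
         ifelse(is.na(p15_c), 2 * p15_b,  # only P15_C missing -> double P15_B
                p15_b + p15_c))           # both present -> plain sum
)

advice_score  # 8 6 4 NA 3
```

Doubling the available item keeps the score on the same 2 to 8 scale, which amounts to assuming the missing answer would have matched the observed one.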
Findings
First, a simple correlation table is generated for the variables of interest in this question. Although Pearson’s correlation is primarily suited for continuous variables, it can still give some idea of the direction of variables with respect to each other as they are all ordered.
## parents_advice_score use_facebook use_twitter use_other_social_media
## [1,] 1 0.04818765 0.03943719 0.04881136
## use_internet
## [1,] 0.07040695
The correlations of parents_advice_score with the four variables are all similar in size, with use_internet slightly higher. Each was tested, and only the correlation with use_internet was significantly different from 0 (positive) at the 5% significance level. Its results are shown below:
##
## Pearson's product-moment correlation
##
## data: local_data4$parents_advice_score and local_data4$use_internet
## t = 2.3099, df = 1071, p-value = 0.02108
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.0106054 0.1297067
## sample estimates:
## cor
## 0.07040695
Visual examination of how the parents_advice_score varies with child’s gender and age:
The first graph suggests that parents might be slightly more likely to talk about inappropriate online behavior to their female children as compared to male children, which is interesting as the general perception is that males need this advice more as they are more likely to indulge in such behaviors.
The second graph suggests that parents might be less likely to talk to their children about inappropriate online behavior as they get older, which is expected.
Now, t tests are conducted to see if the differences in parents_advice_score based on gender and age are statistically significant:
Mean test for gender
##
## Welch Two Sample t-test
##
## data: local_data4$parents_advice_score[local_data4$gender == "Male"] and local_data4$parents_advice_score[local_data4$gender == "Female"]
## t = -1.5124, df = 1071.5, p-value = 0.06536
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 0.01302965
## sample estimates:
## mean of x mean of y
## 6.243902 6.391144
Mean test for age
##
## Welch Two Sample t-test
##
## data: local_data4$parents_advice_score[local_data4$age == 13 | local_data4$age == 14 | local_data4$age == 15] and local_data4$parents_advice_score[local_data4$age == 16 | local_data4$age == 17]
## t = 4.2845, df = 891.97, p-value = 1.015e-05
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.2625846 Inf
## sample estimates:
## mean of x mean of y
## 6.489130 6.062645
The difference in parents_advice_score based on gender is not significant at the 5% level, although the p-value of 0.065 means that it would be significant at the 10% level.
For age, the responses were divided into two clusters, one with children aged 13-15 and the other with children aged 16-17. The difference in parents_advice_score turns out to be significant even at the 1% level.
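A one-sided Welch test of this kind can be sketched on simulated data. The data frame `toy` and its values are hypothetical and unrelated to the survey, so the resulting p-value carries no meaning for the report; only the structure of the call is of interest.

```r
set.seed(42)
# Simulated (not survey) data: ages 13-17 and a score on the 2-8 scale.
toy <- data.frame(
  age   = sample(13:17, 300, replace = TRUE),
  score = sample(2:8, 300, replace = TRUE)
)

# Split the scores into the two age clusters.
younger <- toy$score[toy$age %in% 13:15]
older   <- toy$score[toy$age %in% 16:17]

# Welch test (var.equal = FALSE is the default); "greater" tests whether
# the younger group's mean exceeds the older group's mean.
res <- t.test(younger, older, alternative = "greater")
res$method
res$p.value
```

Using `%in%` for the age clusters avoids the long chains of `==` and `|` and handles missing ages without producing NA indices.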
Initial Regression Model:
##
## Call:
## lm(formula = parents_advice_score ~ use_facebook + use_twitter +
## use_internet + use_other_social_media + age + gender, data = local_data4)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.7443 -0.6870 0.0081 1.4142 2.4087
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.32264 0.51751 16.082 < 2e-16 ***
## use_facebookYes 0.13986 0.11189 1.250 0.2116
## use_twitterYes 0.05655 0.13082 0.432 0.6656
## use_internetYes 0.24929 0.12916 1.930 0.0539 .
## use_other_social_mediaYes 0.07309 0.11542 0.633 0.5267
## age -0.15134 0.03365 -4.498 7.62e-06 ***
## genderMale -0.15854 0.09640 -1.645 0.1003
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.578 on 1065 degrees of freedom
## (9 observations deleted due to missingness)
## Multiple R-squared: 0.02801, Adjusted R-squared: 0.02254
## F-statistic: 5.115 on 6 and 1065 DF, p-value: 3.424e-05
An initial regression model that includes all of the variables suggests that only age and use_internet are significant at some level.
Regressions with multiple other combinations of these variables are tried and then the least significant variables use_twitter and use_other_social_media are taken out to reach the following final model:
##
## Call:
## lm(formula = parents_advice_score ~ use_facebook + use_internet +
## age + gender, data = local_data4)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.791 -0.640 -0.028 1.373 2.411
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.32154 0.51703 16.095 < 2e-16 ***
## use_facebookYes 0.16478 0.10860 1.517 0.1295
## use_internetYes 0.27418 0.12643 2.169 0.0303 *
## age -0.15147 0.03362 -4.505 7.38e-06 ***
## genderMale -0.15752 0.09634 -1.635 0.1023
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.577 on 1067 degrees of freedom
## (9 observations deleted due to missingness)
## Multiple R-squared: 0.02718, Adjusted R-squared: 0.02353
## F-statistic: 7.452 on 4 and 1067 DF, p-value: 6.406e-06
The results show that both use_internet and age are significant at the 5% level. This is consistent with the hypothesis that parents who use at least some form of internet themselves are more likely to understand what goes on online and to talk to their children about inappropriate online behavior. The parents_advice_score increases by 0.27 if the parent uses the internet, which is about 0.17 standard deviations of parents_advice_score.
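The source does not state how the 0.17 figure was computed. One way to arrive at a number of that size, using only quantities printed in the final model output, is sketched below; the variable names are hypothetical.

```r
# Figures copied from the final lm() output above.
beta_internet <- 0.27418  # coefficient of use_internetYes
rse           <- 1.577    # residual standard error
adj_r2        <- 0.02353  # adjusted R-squared

# Adjusted R-squared equals 1 - rse^2 / var(y), so the sample SD of the
# outcome can be backed out from the two reported quantities.
sd_y <- rse / sqrt(1 - adj_r2)

effect_in_sd <- beta_internet / sd_y
round(sd_y, 2)          # 1.6
round(effect_in_sd, 2)  # 0.17
```

Expressing the coefficient in standard deviation units makes it easier to judge its practical size alongside its statistical significance.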
Similarly, parents are more likely to advise about inappropriate online behavior if their children are younger: the parents_advice_score decreases by 0.15 on average for every one-year increase in age. This makes sense, as younger kids are less aware when they are new to the internet. However, it can be argued that parents should pay just as much attention to older kids, as they are more likely to engage in inappropriate behavior in their late teens.
Although use_facebook and gender are not significant here at the 10% level, more data might make these two variables significant in this model as well. This would suggest that parents who specifically use Facebook are more likely to talk to their children about inappropriate online behavior, and also that female children are more likely to be advised than male children. It would be interesting to see how this gender bias varies across different cultures and races.
Question: Relationship between stalking behaviour of parent to child and stalking behaviour of child to his/her boyfriend/girlfriend.
Hypothesis: If parents stalk teen, teen is more likely to inherit the stalking behaviour and stalk his/her boyfriend/girlfriend
Background info - data cleaning and manipulation
For this question, the relevant dataset questions are:
[Dependent variables]:
KR13_C: Have you ever done any of the following to someone you were dating or used to date. Accessed their mobile phone or online accounts
1.Yes
2.No
KR13_F: Have you ever done any of the following to someone you were dating or used to date. Downloaded a GPS or tracking program to their cell phone without them knowing
1.Yes
2.No
[Independent variables]:
P13_B: Have you ever checked which websites your child visited?
1.Yes
2.No
3.Does Not Apply
P14_G: Have you ever used monitoring tools to track your child’s location with his/her cell phone?
1.Yes
2.No
3.Does Not Apply
P14_H: Have you ever looked at the phone call records or messages on your child’s phone?
1.Yes
2.No
3.Does Not Apply
Similar data cleaning and manipulation operations were performed to remove invalid responses and to re-order values from least to most.
## after data cleaning, number of valid data records for analysis is: 317
Findings
##
## Call:
## glm(formula = access_date_phone ~ check_child_website + monitor_child_location +
## look_child_phone_records + gender + age, family = "binomial",
## data = na.omit(data_specific))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8697 -0.4799 -0.3788 -0.3060 2.5490
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.9877 2.4699 -2.019 0.0434 *
## check_child_website1 0.2666 0.4489 0.594 0.5525
## monitor_child_location1 1.0593 0.4148 2.554 0.0107 *
## look_child_phone_records1 0.2878 0.4626 0.622 0.5339
## genderFemale 0.6471 0.3979 1.626 0.1039
## age 0.1147 0.1521 0.754 0.4509
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 207.42 on 316 degrees of freedom
## Residual deviance: 193.82 on 311 degrees of freedom
## AIC: 205.82
##
## Number of Fisher Scoring iterations: 5
## (Intercept) check_child_website1
## 0.006821593 1.305573261
## monitor_child_location1 look_child_phone_records1
## 2.884326470 1.333468483
## genderFemale age
## 1.910007714 1.121517752
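It seems that if a parent monitors their child’s location (parent’s stalking behaviour), the odds that the teen has accessed the phone or online accounts of someone they were dating or used to date (child’s stalking behaviour) are about 2.9 times as high, i.e. roughly 190% higher (odds ratio 2.88, p = 0.011, significant at the 5% level). Note that this is a statement about odds rather than probabilities.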
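The percentage above comes from exponentiating the logistic regression coefficient. As a minimal sketch (written in Python rather than the R used for the report, with variable names chosen here purely for illustration), the odds ratio and an approximate 95% Wald interval can be reproduced from the printed estimate and standard error:

```python
import math

# Estimate and standard error for monitor_child_location1, copied from the summary above
estimate = 1.0593
std_error = 0.4148

odds_ratio = math.exp(estimate)               # multiplicative change in the odds
pct_change_in_odds = (odds_ratio - 1) * 100   # percentage change in the odds

# Approximate 95% Wald interval on the odds-ratio scale (normal approximation)
z = 1.96
ci_low = math.exp(estimate - z * std_error)
ci_high = math.exp(estimate + z * std_error)

print(f"odds ratio = {odds_ratio:.3f} ({pct_change_in_odds:.0f}% higher odds)")
print(f"approx. 95% CI for the odds ratio: {ci_low:.2f} to {ci_high:.2f}")
```

The interval is wide, so the size of the effect is uncertain even though its direction is reasonably clear.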
##
## Call:
## glm(formula = track_date_loc ~ check_child_website + monitor_child_location +
## look_child_phone_records + gender + age, family = "binomial",
## data = na.omit(data_specific))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0646 -0.2186 -0.1659 -0.1138 3.0206
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.1906 4.1111 1.019 0.3080
## check_child_website1 -0.5984 0.8501 -0.704 0.4814
## monitor_child_location1 1.8977 0.7771 2.442 0.0146 *
## look_child_phone_records1 -0.7579 0.8398 -0.902 0.3668
## genderFemale 0.4984 0.7610 0.655 0.5125
## age -0.5275 0.2713 -1.945 0.0518 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 74.668 on 316 degrees of freedom
## Residual deviance: 64.901 on 311 degrees of freedom
## AIC: 76.901
##
## Number of Fisher Scoring iterations: 7
## (Intercept) check_child_website1
## 66.0622783 0.5496711
## monitor_child_location1 look_child_phone_records1
## 6.6704953 0.4686308
## genderFemale age
## 1.6461638 0.5900521
It seems that if a parent monitors their child’s location (parent’s stalking behaviour), the odds that the teen has tracked the location of someone they are dating or used to date (child’s stalking behaviour) are about 6.7 times as high, i.e. roughly 567% higher (odds ratio 6.67, p = 0.015, significant at the 5% level). It is interesting to note that this coefficient is much higher than the one for the previous child stalking behaviour (accessing the phone of a significant other). This seems to indicate that a parent’s behaviour correlates more strongly with the same behaviour in the child (both tracking location).
In both models, a parent monitoring the child’s location was associated with the child’s stalking behaviour (accessing phone and tracking GPS location).
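To get a rough sense of how much of the outcome each model accounts for, the null and residual deviances printed in the two summaries can be combined into McFadden's pseudo R-squared. This measure is not part of the original analysis; the following Python sketch (the report itself used R) is only an illustration:

```python
# Null and residual deviances copied from the two model summaries above
models = {
    "access_date_phone": (207.42, 193.82),
    "track_date_loc": (74.668, 64.901),
}

def mcfadden_r2(null_deviance, residual_deviance):
    """McFadden's pseudo R-squared.

    For 0/1 outcomes the deviance equals -2 * log-likelihood, so the ratio
    of deviances equals the ratio of log-likelihoods.
    """
    return 1 - residual_deviance / null_deviance

pseudo_r2 = {name: mcfadden_r2(*dev) for name, dev in models.items()}
for name, value in pseudo_r2.items():
    print(f"{name}: pseudo R-squared = {value:.3f}")
```

Both values are fairly low, which suggests that the predictors explain only a modest share of the variation in either behaviour.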
Given that only one of the three parental stalking behaviours was statistically significant, a possible alternative hypothesis is that the teen’s stalking behaviour is influenced more by his/her own characteristics. Various other questions regarding the teen were explored; these are listed below.
KR8: How frequently you expect to hear from your boyfriend/girlfriend/significant other in some way? 1.Hourly,2.Every few hours,3.Once a day,4.A few times a week,5.Once a week,6.Less often
KR3_A: Have you ever searched for information online about someone you were currently dating or were interested in dating? 1(YES)-2(NO)
KR3_C: Have you ever searched for information online about someone you dated or hooked up with in the past? 1(YES)-2(NO)
KF15: Have you ever had a fight with any of your friends that started because of something that happened online or because of a text? 1(YES)-2(NO)
KFSNS1_B: In general, does social media make you feel worse about your own life because of what you see from other friends on social media? 1.Yes, a lot,2.Yes, a little,3.No
KFSNS1_E: In general, does social media make you feel pressure to only post content that makes you look good to others? 1.Yes, a lot,2.Yes, a little,3.No
##
## Call:
## glm(formula = track_date_loc ~ check_child_website + monitor_child_location +
## look_child_phone_records + feel_worse + gender + age, family = "binomial",
## data = na.omit(data_specific))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4557 -0.1925 -0.1552 -0.1155 3.0441
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.6636 4.6562 0.572 0.5673
## check_child_website1 -0.6114 0.9956 -0.614 0.5391
## monitor_child_location1 0.8966 0.9302 0.964 0.3351
## look_child_phone_records1 -0.8473 0.9826 -0.862 0.3885
## feel_worse1 0.3441 1.2156 0.283 0.7771
## feel_worse2 2.4532 1.0162 2.414 0.0158 *
## genderFemale 0.0326 0.8686 0.038 0.9701
## age -0.4163 0.2988 -1.393 0.1635
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 65.315 on 276 degrees of freedom
## Residual deviance: 51.912 on 269 degrees of freedom
## AIC: 67.912
##
## Number of Fisher Scoring iterations: 7
## (Intercept) check_child_website1
## 14.3484541 0.5425950
## monitor_child_location1 look_child_phone_records1
## 2.4512708 0.4285541
## feel_worse1 feel_worse2
## 1.4107123 11.6257104
## genderFemale age
## 1.0331405 0.6594689
After testing these various questions, only the “feel worse” question (KFSNS1_B) was statistically significant. It is interesting to see that feeling a lot worse (feel_worse2) had a much larger coefficient (2.45, odds ratio about 11.6) than feeling a little worse (feel_worse1), which was not significant. So perhaps a teen with lower self-esteem would be more likely to stalk his/her significant other. It is also interesting to note that adding the feel_worse variable made the parent’s monitor_child_location variable no longer statistically significant. This model was fitted on fewer complete records (277 rather than 317), so the two models are not directly comparable.
Question: How much more likely are children to trust their significant others, and not have the urge to constantly monitor their activities, if their parents are more trusting of them?
Hypothesis: Children are likely to have healthier relationships, and not have insecurities regarding their significant others, if they are trusted by their parents.
Background info - data cleaning and manipulation
For this question, the relevant dataset questions are:
[Dependent variable]:
KRSNS3_A (kid_tracking_significant_others_activity): When you use social media do you ever keep track of where your boyfriend, girlfriend or significant other is or what they are doing?
1.Yes
2.No
[Independent variables]:
P13_A (used_parental_controls): Have you ever used parental controls or other technological means of blocking, filtering or monitoring your child’s online activities?
1.Yes
2.No
3.Does Not Apply
P13_B (checked_websites_visited): Have you ever checked which websites your child visited?
1.Yes
2.No
3.Does Not Apply
P14_G (tracked_childs_location): Have you ever used monitoring tools to track your child’s location with his/her cell phone?
1.Yes
2.No
3.Does Not Apply
P14_H (tracked_calls_and_messages): Have you ever looked at the phone call records or messages on your child’s phone?
1.Yes
2.No
3.Does Not Apply
Data cleaning
Similar data cleaning and manipulation operations were performed to get rid of invalid responses and to re-order values from least to most.
## After data cleaning, the number of valid data records for analysis is: 279
Findings
A simple correlation table is generated for the variables of interest in this question. Again, Pearson’s correlation is primarily suited for continuous variables; here it just gives some idea of the direction of the variables with respect to each other.
## kid_tracking_significant_others_activity used_parental_controls
## [1,] 1 0.0583896
## checked_websites_visited tracked_childs_location
## [1,] 0.05163415 0.1549288
## tracked_calls_and_messages
## [1,] 0.05825054
The correlation of kid_tracking_significant_others_activity with tracked_childs_location is relatively higher as compared to other variables.
Correlation tests were run for each of these. Only for tracked_childs_location is there statistically significant evidence that its correlation with kid_tracking_significant_others_activity is different from 0. Its results are shown below:
##
## Pearson's product-moment correlation
##
## data: local_data5$kid_tracking_significant_others_activity and local_data5$tracked_childs_location
## t = 2.6614, df = 288, p-value = 0.008219
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.04047146 0.26537295
## sample estimates:
## cor
## 0.1549288
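The test statistic in this output follows directly from the sample correlation and its degrees of freedom. A minimal Python sketch (not the report's own R code), using only the two numbers printed above:

```python
import math

r = 0.1549288   # sample correlation from the output above
df = 288        # degrees of freedom from the output above (n - 2)

# Test statistic for H0: true correlation = 0
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Share of variance in one variable linearly associated with the other
r_squared = r ** 2

print(f"t = {t_stat:.4f}, r^2 = {r_squared:.3f}")
```

An r of about 0.15 corresponds to roughly 2% of shared variance, so the association is statistically detectable but weak.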
Visual examination of how different ways that parents use to monitor their children’s activities vary with the child’s gender:
From these graphs, it appears that the parents’ ways to monitor their children do not significantly change based on the child’s gender. However, it’s interesting to see that females are relatively more likely to be monitored by their parents as compared to males. Also, it is perhaps surprising to see that there is a large proportion of parents in general who have tracked children’s calls and messages and their visited websites at least at some point in time.
Visual examination of how kid_tracking_significant_others_activity varies with gender:
Both the graphs suggest that females are more likely to track the activities of their significant others on social media as compared to males, which seems interesting if this behavior holds generally outside this sample too.
Now, a proportion test is conducted to see if the difference in kid_tracking_significant_others_activity based on gender is statistically significant:
##
## 2-sample test for equality of proportions with continuity
## correction
##
## data: table(local_data5$kid_tracking_significant_others_activity, local_data5$gender)
## X-squared = 5.4052, df = 1, p-value = 0.01004
## alternative hypothesis: less
## 95 percent confidence interval:
## -1.00000000 -0.04337236
## sample estimates:
## prop 1 prop 2
## 0.4495413 0.6000000
The test results show that the difference in kid_tracking_significant_others_activity based on gender is significant at 5%. It suggests that males are less likely to keep track of their significant other on social media. However, a regression model can be used to further investigate this hypothesis.
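The one-sided p-value reported by the proportion test can be recovered from the X-squared statistic, since with one degree of freedom it is the square of a standard normal z value. A short Python sketch (an illustration only, not the original R call):

```python
import math

x_squared = 5.4052   # test statistic from the output above (1 degree of freedom)

# With 1 df, the chi-squared statistic is the square of a standard normal z
z = math.sqrt(x_squared)

# Upper-tail normal probability; by symmetry this is the one-sided p-value
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
p_two_sided = 2 * p_one_sided

print(f"z = {z:.4f}, one-sided p = {p_one_sided:.5f}, two-sided p = {p_two_sided:.5f}")
```

The two-sided p-value is twice as large but still below 0.05, so the conclusion does not depend on having chosen a directional alternative.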
An initial regression model is run to see how the parents’ monitoring of their children affects the children’s monitoring of their significant others on social media. Again, logistic regression is used since the dependent variable is binary:
##
## Call:
## glm(formula = kid_tracking_significant_others_activity ~ used_parental_controls +
## checked_websites_visited + tracked_childs_location + tracked_calls_and_messages +
## age + gender, family = "binomial", data = local_data5)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.2927 -0.8130 -0.7184 1.1630 1.9084
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.77825 1.71499 -1.037 0.2998
## used_parental_controlsYes 0.16726 0.29953 0.558 0.5766
## checked_websites_visitedYes 0.27233 0.30814 0.884 0.3768
## tracked_childs_locationYes 0.88774 0.32558 2.727 0.0064 **
## tracked_calls_and_messagesYes -0.12845 0.30996 -0.414 0.6786
## age 0.04979 0.10598 0.470 0.6385
## genderMale -0.43482 0.27040 -1.608 0.1078
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 339.67 on 278 degrees of freedom
## Residual deviance: 325.63 on 272 degrees of freedom
## (802 observations deleted due to missingness)
## AIC: 339.63
##
## Number of Fisher Scoring iterations: 4
This model, which includes all of the variables, suggests that only tracked_childs_location is significant; gender is close to the border of significance (p = 0.108).
Regressions with multiple other combinations of these variables are tried and then the less significant variables used_parental_controls and tracked_calls_and_messages are taken out to reach the following final model:
##
## Call:
## glm(formula = kid_tracking_significant_others_activity ~ checked_websites_visited +
## tracked_childs_location + age + gender, family = "binomial",
## data = local_data5)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.2701 -0.8479 -0.6941 1.1447 1.9344
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.00990 1.65517 -1.214 0.22463
## checked_websites_visitedYes 0.26521 0.28324 0.936 0.34909
## tracked_childs_locationYes 0.78818 0.30482 2.586 0.00972 **
## age 0.06893 0.10325 0.668 0.50434
## genderMale -0.58996 0.26565 -2.221 0.02637 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 351.89 on 288 degrees of freedom
## Residual deviance: 337.30 on 284 degrees of freedom
## (792 observations deleted due to missingness)
## AIC: 347.3
##
## Number of Fisher Scoring iterations: 4
## (Intercept) checked_websites_visitedYes
## 0.1340024 1.3037090
## tracked_childs_locationYes age
## 2.1993808 1.0713654
## genderMale
## 0.5543491
The results show that both tracked_childs_location and gender are significant at 5%. The exponentiated coefficients show that teens whose parents use monitoring tools to track their location have about 120% higher odds (odds ratio 2.20) of tracking their significant others’ activities on social media. It is important to note, however, that the number of parents who use monitoring tools to track their child’s location is not particularly large in this survey compared with the other ways parents use to track their child. So further research would be needed to see if the effect holds in larger samples.
Also, interestingly, males have about 45% lower odds (odds ratio 0.55) of keeping track of their significant others on social media. This might go against our general perception, which usually is that females tend to trust their partners more. Perhaps a possible reason for this difference is that females tend to give more importance to their relationships than their male counterparts in general, and perhaps their happiness and well-being depend more on how things are going in their intimate relationships, so they are more concerned to find out about their significant other. Again, more research / data outside this dataset would be helpful to further investigate this result.
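Because the figures above describe changes in odds, it can help to translate them into predicted probabilities for a concrete case. The Python sketch below (not the report's R code) plugs the final model's coefficients into the inverse-logit function; the profile of a 15-year-old girl is a hypothetical chosen only for illustration:

```python
import math

# Coefficients copied from the final model above (female is the reference gender)
coef = {
    "intercept": -2.00990,
    "checked_websites_visitedYes": 0.26521,
    "tracked_childs_locationYes": 0.78818,
    "age": 0.06893,
    "genderMale": -0.58996,
}

def predicted_probability(age, male, checked_websites, tracked_location):
    """Inverse-logit of the linear predictor for one hypothetical teen."""
    eta = (coef["intercept"]
           + coef["age"] * age
           + coef["genderMale"] * male
           + coef["checked_websites_visitedYes"] * checked_websites
           + coef["tracked_childs_locationYes"] * tracked_location)
    return 1 / (1 + math.exp(-eta))

# Hypothetical profile: 15-year-old girl whose parents have not checked her websites
p_not_tracked = predicted_probability(15, 0, 0, 0)
p_tracked = predicted_probability(15, 0, 0, 1)
relative_increase = (p_tracked / p_not_tracked - 1) * 100

print(f"{p_not_tracked:.3f} -> {p_tracked:.3f} ({relative_increase:.0f}% higher probability)")
```

For this profile the increase in probability is noticeably smaller than the 120% increase in odds, which is why the two should not be used interchangeably.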
The analysis was limited to dependent variables with binary (Yes/No) options. Ideally, dependent variables with more than 2 options should also be considered. A few interesting hypotheses were considered in that respect as well for further study. However, since that would have required an implementation of ordinal logistic regression, perhaps using the polr function in the MASS library, the analysis was kept to these relatively simpler models in the interest of time. With additional time, it would have been interesting to investigate and understand this further.
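For readers unfamiliar with ordinal logistic regression, the proportional-odds model that polr fits assigns each ordered answer a probability through cumulative logits. The Python sketch below only illustrates that formula; the thresholds and the effect size are hypothetical values, not estimates from this dataset:

```python
import math

def logistic(x):
    return 1 / (1 + math.exp(-x))

def category_probabilities(thresholds, linear_predictor):
    """Proportional-odds model: P(Y <= j) = logistic(threshold_j - linear_predictor)."""
    cumulative = [logistic(t - linear_predictor) for t in thresholds] + [1.0]
    return [cumulative[0]] + [cumulative[j] - cumulative[j - 1]
                              for j in range(1, len(cumulative))]

# Hypothetical increasing thresholds for a 3-level answer (No / Yes, a little / Yes, a lot)
thresholds = [0.5, 2.0]
probs_baseline = category_probabilities(thresholds, 0.0)
probs_shifted = category_probabilities(thresholds, 1.0)   # hypothetical effect of +1.0

print([round(p, 3) for p in probs_baseline])
print([round(p, 3) for p in probs_shifted])
```

A positive linear predictor moves probability mass towards the higher answer categories while keeping the same thresholds, which is the "proportional odds" assumption.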
Also, since there were a lot of missing values in the dataset, some of the models were based on a relatively small sample. With a larger number of complete observations, it might have been possible to get more interesting relationships that would come out to be statistically significant, which might not have been apparent in this relatively small number of complete responses.
A more robust analysis of how parents’ behaviours affect teens’ behaviours with regard to social media usage (e.g. stalking behaviour, defiance / urge to be liberalised etc.) could be conducted if there were more data. Perhaps more questions could be asked of parent and teen in this respect, to further tease out these possible relationships. More historical data, not just for one year, could be collated and analysed to discover whether there are any trends, or whether these parent-child relationships are consistent over time for 13-17 year olds across cohorts. It would also be interesting if the survey were conducted for older children, beyond 17 years old, to test whether certain relationships between parent and child persist beyond these ages. For example, if the data suggest that a parent’s stalking behaviour correlates positively with the teen’s stalking behaviour, it would be interesting to investigate whether the teen’s stalking behaviour persists when he/she is, say, 25 years old. If so, it would suggest that parents should be educated about which interventions are appropriate, since well-intended interventions, if executed in the wrong way, could perhaps have a more lasting adverse impact on the teen.
Regarding the analyses used in this section, the approach was rather simplistic. For instance, not much attention was paid to the behaviour of the residuals in the models to ascertain whether they were fairly normal. Ideally, the residuals should be examined to make sure that no important variables have been left out, since omitted variables might have biased the coefficient estimates. Also, when more ordered factor or continuous variables are included to test further relationships, transformations of the variables might lead to better-fitted models.
In addition, since many of the variables in the dataset have binary responses, a phi coefficient (from psych library) can be a better measure of looking at the interdependence of different variables in the initial testing phase, instead of simply using Pearson’s correlations, which are primarily suited for continuous variables.
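As a small illustration of the phi coefficient, the Python sketch below computes it from a 2x2 table and compares it with Pearson's r on the same data coded as 0/1. The counts are hypothetical and the helper names are chosen here for illustration; this is not the psych package's implementation:

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]] of two binary variables."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical counts: rows = parent tracked location (yes/no),
# columns = teen tracks partner (yes/no)
a, b, c, d = 20, 30, 60, 180

# Expand the table into 0/1 vectors to compare with Pearson's r
xs = [1] * (a + b) + [0] * (c + d)
ys = [1] * a + [0] * b + [1] * c + [0] * d

phi = phi_coefficient(a, b, c, d)
r = pearson(xs, ys)
print(f"phi = {phi:.4f}, Pearson r on 0/1 codes = {r:.4f}")
```

For two binary variables coded 0/1 the two measures coincide numerically, so switching to phi mainly changes how the value is described and interpreted rather than the number itself.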
Finally, as mentioned earlier, getting a larger set of complete responses by increasing the initial sample size and the use of ordinal logistic regression for exploring some additional relationships can lead to the identification of some more interesting patterns and relationships in the dataset.
The dataset had a number of questions that were posed to the teen about their close friends and relationships with their significant others.
A number of hypotheses were investigated using this dataset, to see if there were any relationships between teen’s behaviours and their relationship with close friends / significant others.
For example, teens were asked about the amount of time they spend with their closest friends (either online or face-to-face) and what social media accounts they own. This was an opportunity to investigate if social media accounts have an effect on the amount of time teenagers spend with their friends.
Effects of Social Media to Teens Relation with Friends and Girlfriends/Boyfriends
The table below summarises the questions being studied, the initial hypotheses, and the findings drawn from the data.
| Question | Hypothesis | Findings |
|---|---|---|
| 1. Number of friends and followers on social media varies by gender | Female teens seem more attached to social media than male teens, so the number of friends and followers of female users might be higher | On average, female teens have more friends on Facebook and more followers on Instagram than male teens. The number of social media accounts that female teens use is also slightly higher. |
| 2. Relationship between the number of social media accounts and electronic devices teens have and how much time they spend with their close friends | The more social media accounts/electronic devices a teen has, the more time he/she would spend with their close friends | Tests show a positive correlation between the variables. Teens tend to spend more time with their close friends (face-to-face, by phone, or through any other medium) when they have more social media accounts or electronic devices. This might indicate that social media and electronic devices help teens to communicate or get along with their friends. |
| 3. Relationship between teens’ online dating experiences and their perception of other people’s image on social media | Teens who think that people tend to show a different side of themselves on social media would be less likely to experience online dating | Results show that there is a correlation between the two variables. One might think that people would be less likely to date online when they are aware that other people might be showing a different side of themselves. However, teens do in fact date online, including partners they have never met in person, even though they are aware that other people might not be presenting their true selves. |
| 4. Do teens send flirtatious messages or flirty pictures or videos when they are attracted to someone? | Teens aged 13-17 are less likely to send flirty messages, pictures, or videos to someone they find attractive | Teens with an African American background and teens with a Japanese background showed a positive tendency to send flirtatious messages, pictures, or videos to show their interest in somebody else |
| 5. How do teens react when they break up with their girlfriend/boyfriend? Do they block or unfriend their ex on social media, or even remove them from their phone address book? | Younger teenagers might do such things when they break up, but this behavior will diminish as they get older | Results show that female teens tend to unfriend/block their ex more than male teens. The older the teens, the more likely they are to remove their ex from their phone address book. This is in contrast to what one would expect before testing. |
Hypothesis: Female teens seem more attached to social media than male teens, so the number of friends and followers of female users might be higher
Background info
For this question, the relevant dataset questions are:
K6_1: Which of the following social media do you use? Facebook?
K6_2: Which of the following social media do you use? Twitter?
K6_3: Which of the following social media do you use? Instagram?
K6_4: Which of the following social media do you use? Google+?
K6_5: Which of the following social media do you use? Snapchat?
K6_6: Which of the following social media do you use? Vine?
K6_7: Which of the following social media do you use? Tumblr?
The options respondents had were:
1. Yes
2. No
The number of social media accounts a person has can be calculated by summing up all “Yes” answers
KFB1A: How many friends do you have on Facebook? The options respondents had ranged from 0 to 9999
KFB1B: How many followers do you have on Instagram? The options respondents had ranged from 0 to 9999
KFB1C: How many followers do you have on Twitter? The options respondents had ranged from 0 to 9999
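The counting step can be sketched as follows. This is a Python illustration rather than the report's R code, and the response dictionary is a made-up example record:

```python
# Hypothetical responses for one teen: 1 = Yes, 2 = No (coding as listed above)
response = {"K6_1": 1, "K6_2": 2, "K6_3": 1, "K6_4": 2, "K6_5": 1, "K6_6": 2, "K6_7": 2}

ACCOUNT_QUESTIONS = [f"K6_{i}" for i in range(1, 8)]
YES = 1

def count_accounts(resp):
    """Number of platforms answered "Yes"; missing answers are not counted."""
    return sum(1 for q in ACCOUNT_QUESTIONS if resp.get(q) == YES)

total_accounts = count_accounts(response)
print(total_accounts)
```

Treating a missing answer as "not used" is a design choice made in this sketch; the original script may instead have dropped such records.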
Child_gender: Is your child (the user of social media) male or female? The options respondents had were:
1. Male
2. Female
Findings
Social media behavior may vary by gender. In general, females are assumed to be more active on social media than males. In terms of friends or followers on social media, females in this sample do have more than males.
Survey data from 249 female and 231 male respondents show that the number of Facebook friends of female users is considerably higher than that of male users.
Females have on average 304 friends on Facebook while males have 221. The median is also lower for males: 100 friends, compared with 156 for females. It can be seen from the graph that females have a wider range of Facebook friends, with a maximum of 5000. This is probably because Facebook limits an account to a maximum of 5000 friends. It is assumed here that each user has only one Facebook account and will not create a new one in order to have more than 5000 friends. From the t test it can be concluded that the mean number of Facebook friends is higher for female teen users than for male teen users (p-value = 0.03822, significant at the 5% level).
Besides Facebook, another social media platform that is currently popular among teenagers is Instagram. The survey data, in which 206 females and 112 males use Instagram, show that female users also have more Instagram followers than male users.
This is calculated from the survey questions that give the gender of the social media user and the number of Instagram followers that user has.
On average, females have more Instagram followers (417) than males (280). Again, females have a wider interquartile range than males. The lower quartile for females is 68 followers and the upper quartile is 450, while for males the lower quartile is only 34 followers and the upper quartile is around 200 followers lower than for females.
Instagram is slightly different from Facebook. Using Instagram lets you have followers rather than friends. This means that user A can follow user B even though B does not follow A back. On Facebook, friends are the other users that are recognized or accepted as your friends. So if A is a friend of B then B must also be a friend of A. This might explain why on average, number of Instagram followers is higher than number of Facebook friends for both female and male users.
| Average Number of Friends/Followers | Facebook friends | Instagram followers |
|---|---|---|
| Female User | 304 | 417 |
| Male User | 221 | 280 |
The plot below shows that most females have up to four social media accounts and most males have up to three. The median is also higher for females (three accounts) than for males (two). This might explain why females tend to have more friends and followers on social media. People usually put links to their other social media accounts in their profile; for example, a link to a Snapchat account in an Instagram profile. Thus the more social media accounts a person has, the more likely that person is to be recognized online.
## [1] 2.992495
## [1] 2.248077
The number of Facebook friends that a teen has can be predicted using the number of followers on Instagram and Twitter. The regression has an adjusted R-squared of 0.3396 (multiple R-squared 0.3454), which means that about 34% of the variation in the number of Facebook friends can be explained by the numbers of Instagram and Twitter followers.
##
## Call:
## lm(formula = FBFriends ~ IGFol + TwitterFol, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1277.67 -129.26 -82.67 89.06 2140.69
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 119.75637 25.37908 4.719 4.17e-06 ***
## IGFol 0.57449 0.06642 8.649 9.93e-16 ***
## TwitterFol 0.32955 0.10372 3.177 0.0017 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 356.4 on 225 degrees of freedom
## (853 observations deleted due to missingness)
## Multiple R-squared: 0.3454, Adjusted R-squared: 0.3396
## F-statistic: 59.36 on 2 and 225 DF, p-value: < 2.2e-16
This regression could be improved by including, in the next survey, other continuous variables that might be correlated with the number of Facebook friends, such as how much time is spent on Facebook or how many friends he/she has in real life (e.g. number of schoolmates).
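To make the fitted equation concrete, the Python sketch below (an illustration, not the original R code) applies the printed coefficients to a hypothetical teen and recomputes the adjusted R-squared from the multiple R-squared:

```python
# Coefficients copied from the regression output above
intercept, b_instagram, b_twitter = 119.75637, 0.57449, 0.32955

def predict_fb_friends(ig_followers, twitter_followers):
    """Fitted number of Facebook friends for given follower counts."""
    return intercept + b_instagram * ig_followers + b_twitter * twitter_followers

# Hypothetical teen with 200 Instagram followers and 100 Twitter followers
predicted = predict_fb_friends(200, 100)

# Adjusted R-squared recomputed from the multiple R-squared
r2 = 0.3454
n = 225 + 3   # residual degrees of freedom plus three estimated coefficients
k = 2         # number of predictors
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(f"predicted friends = {predicted:.0f}, adjusted R^2 = {adjusted_r2:.4f}")
```

The residual standard error of 356.4 is large relative to the average friend counts reported earlier, so individual predictions from this equation are imprecise.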
Hypothesis: The more social media accounts/electronic devices a teen has, the more time he/she would spend with their close friends
Background info
For this question, the relevant dataset questions are:
KF12: Now, thinking again about friends, please think about the friend you are closest to, someone you can talk to about things that are really important to you, but who is not a boyfriend or girlfriend. How often are you in touch with this person? The options respondents had were:
1. Many times a day
2. Once a day
3. A few times a week
4. Once a week
5. Once every few weeks
6. Less often
7. Do not have a close friend
To ease interpretation, the answer scale of the communication frequency question is reversed, so that 1 is the least and 7 the most frequent. After conversion, the scale becomes:
1. Do not have a close friend
2. Less often
3. Once every few weeks
4. Once a week
5. A few times a week
6. Once a day
7. Many times a day
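The reversal of the scale amounts to subtracting each code from 8. A minimal Python sketch (the report's own recoding was done in R; the function name is chosen here for illustration):

```python
ORIGINAL_LABELS = {
    1: "Many times a day", 2: "Once a day", 3: "A few times a week", 4: "Once a week",
    5: "Once every few weeks", 6: "Less often", 7: "Do not have a close friend",
}

def reverse_code(value, n_levels=7):
    """Flip a 1..n_levels scale so that higher values mean more frequent contact."""
    return n_levels + 1 - value

recoded_labels = {reverse_code(code): label for code, label in ORIGINAL_LABELS.items()}
for code in sorted(recoded_labels):
    print(code, recoded_labels[code])
```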
K6_1: Which of the following social media do you use? Facebook?
K6_2: Which of the following social media do you use? Twitter?
K6_3: Which of the following social media do you use? Instagram?
K6_4: Which of the following social media do you use? Google+?
K6_5: Which of the following social media do you use? Snapchat?
K6_6: Which of the following social media do you use? Vine?
K6_7: Which of the following social media do you use? Tumblr?
The options respondents had were:
1. Yes
2. No
The number of social media accounts a person has can be calculated by summing up all “Yes” answers
K3_A: Do you have a smartphone?
K3_B: Do you have a cell phone that is not a smartphone?
K3_C: Do you have a desktop or laptop computer?
K3_D: Do you have a tablet computer like an iPad, Samsung Galaxy or Kindle Fire?
K3_E: Do you have a gaming console like an Xbox, PlayStation or Wii?
The options respondents had were yes or no.
The total number of electronic device types that a person possesses is calculated by summing up all “Yes” answers. The total is stored in a new variable called “x”.
Findings
People frequently relate social media to friendship, particularly how it might affect the time they spend with their close friends. This area is of interest because social media did not exist until relatively recently and has since grown rapidly into many platforms such as Facebook, Snapchat, and Google+.
In the survey, social media users are asked about how much time they spend with their close friends. This includes face-to-face contact, phone calls, text messaging and all the other ways they might talk to them.
The correlation between the total number of social media accounts a person has and how often that person interacts with their close friends through any medium can be calculated.
##
## Pearson's product-moment correlation
##
## data: data_specific$y and data_specific$x
## t = 8.5435, df = 1045, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1979956 0.3112819
## sample estimates:
## cor
## 0.2555156
The correlation between those two variables is 0.256 and the p-value from the correlation test is < 2.2e-16, which means that the null hypothesis that the correlation is equal to 0 can be rejected. The 95% confidence interval, 0.20 to 0.31, also shows that there is a positive relationship between the number of social media accounts the child has and how much time he/she spends with his/her close friends. Thus teens with more social media accounts tend to spend more time with their close friends, either face-to-face or through social media.
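The confidence interval reported by the correlation test can be reproduced with the Fisher z transformation. The following Python sketch (not the report's R code) uses only the correlation and degrees of freedom printed in the output:

```python
import math

r = 0.2555156   # sample correlation from the output above
n = 1045 + 2    # degrees of freedom plus two

z = math.atanh(r)           # Fisher transformation
se = 1 / math.sqrt(n - 3)   # approximate standard error on the z scale
z_crit = 1.959964           # 97.5th percentile of the standard normal

ci_low = math.tanh(z - z_crit * se)
ci_high = math.tanh(z + z_crit * se)
print(f"95% CI: {ci_low:.4f} to {ci_high:.4f}")
```

The interval is narrow mainly because the sample is large; the correlation itself remains modest in size.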
Another correlation that might exist is the relationship between the number of electronic device types someone has with time spent with their close friends. This includes face-to-face communication and any other way someone may interact with their close friends.
Test correlation between time child spent with close friends and the number of electronic device types the child possesses:
##
## Pearson's product-moment correlation
##
## data: data_specific$y and data_specific$TotalDeviceType
## t = 4.5409, df = 1046, p-value = 6.253e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.07914807 0.19792751
## sample estimates:
## cor
## 0.1390378
The correlation between those two variables is 0.139 and the p-value from the correlation test is 6.253e-06, which means that the null hypothesis that the correlation is equal to 0 can be rejected. The 95% confidence interval, 0.08 to 0.20, also shows that there is a positive relationship between the number of electronic device types the child has and how much time he/she spends with his/her close friends. This might be because electronic devices such as a smartphone or laptop help teens communicate more frequently with their close friends, and because gaming consoles might also be a way to spend time together with close friends.
These findings could be made more precise. It would be better if the survey split the question of time spent with close friends into two questions: “How much time do you spend with close friends face-to-face (e.g. having lunch together, playing basketball, playing games)?” and “How much time do you spend with close friends through any other way (e.g. text messaging, Facebook chats, phone calls)?”
Hypothesis: Teens who think that people tend to show different side of themselves on the social media would be less likely to experience online dating
Background info
For this question, the relevant dataset questions are:
KFSNS3_A: Do you agree or disagree with each of the following statements? People get to show different sides of themselves on social media that they can’t show offline. The options respondents had were:
1. Strongly agree
2. Agree
3. Disagree
4. Strongly disagree
KR2: Have you ever had a boyfriend, girlfriend or significant other that you first met online, but never met in person? The options respondents had were:
1. Yes
2. No
Findings
Social media captures not only a teen’s relationships with friends but also his/her romantic relationships. One interesting aspect is teens’ experience of finding a boyfriend/girlfriend through social media.
##
## Pearson's product-moment correlation
##
## data: data_specific$x1 and data_specific$y
## t = 2.4989, df = 81, p-value = 0.01448
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05501593 0.45685763
## sample estimates:
## cor
## 0.2675306
The correlation between those variables turned out to be positive (r = 0.27, p = 0.014, 81 degrees of freedom), so the hypothesis is not supported. Because agreement and “Yes” both carry the lower codes, a positive correlation means that teens who agree that people show a different side of themselves online that they cannot show offline are more likely to have had a relationship with somebody they met online and never met in person. One possible explanation is that they find a person more interesting online and hence fall for that person; another is that people can present their more charming side, or their true self, more easily on social media than in person. With only 83 valid responses, this result should be read with caution.
Hypothesis: Teens aged 13-17 are less likely to send flirty messages, pictures, or videos to someone they find attractive
Background info
For this question, the relevant dataset questions are:
KDATE2_D: Have you ever done any of these things to let someone know you were attracted to them or interested in them? Have you sent them flirtatious messages? The options respondents had were:
1. Yes
2. No
KDATE2_G: Have you ever done any of these things to let someone know you were attracted to them or interested in them? Have you sent them sexy or flirty pictures or videos of yourself? The options respondents had were:
1. Yes
2. No
QS10_1-15: Please check one or more categories below to indicate what race(s) you consider yourself to be.
Child_gender: Is your child (the user of social media) male or female? The options respondents had were:
1. Male
2. Female
Findings
Correlation between sending flirty messages and Black/African American teens:
##
## Pearson's product-moment correlation
##
## data: data_specific$x2 and data_specific$a
## t = -1.8264, df = 1051, p-value = 0.06808
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.11626379 0.00417964
## sample estimates:
## cor
## -0.05624671
Correlation between sending flirty messages and Japanese teens:
##
## Pearson's product-moment correlation
##
## data: data_specific$x7 and data_specific$a
## t = -2.7905, df = 1051, p-value = 0.005359
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.14541621 -0.02547728
## sample estimates:
## cor
## -0.08575744
Correlation between sending flirty videos and Japanese teens:
##
## Pearson's product-moment correlation
##
## data: data_specific$x7 and data_specific$b
## t = -2.2002, df = 1051, p-value = 0.02801
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.127602854 -0.007330528
## sample estimates:
## cor
## -0.06771269
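For the last test (flirty pictures or videos, Japanese teens) the correlation is -0.068 and the p value is 0.02801, which means that the null hypothesis that the correlation is equal to 0 can be rejected at the 5% level. The 95% confidence interval runs from -0.128 to -0.007. All three tests show negative correlations, although the result for Black/African American teens (p = 0.068) is significant only at the 10% level. Because “Yes” is coded 1 and “No” is coded 2, a negative correlation with a race indicator means more “Yes” answers in that group (assuming the indicator takes the higher value when the race is selected). This suggests that Japanese and Black/African American teens are more likely to send flirtatious pictures, videos and messages than teens of other races such as White, Asian Indian, Chinese, Filipino, Korean and others. The correlations are weak (absolute values below 0.09), so the effect is small.
Hypothesis: Younger teenagers might unfriend, block or delete an ex when they break up, but this behaviour will diminish as they get older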
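The sign of these correlations depends on how the answers are coded, which is easy to get backwards. The sketch below uses invented toy data (not survey responses) and a hypothetical 0/1 group indicator to show that, with “Yes” coded 1 and “No” coded 2, a group that answers “Yes” more often produces a negative Pearson correlation. It is written in Python with the standard library only, not in the report's R.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data (hypothetical): 1 = belongs to the group, 0 = does not
group = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Survey coding: 1 = Yes, 2 = No; 3 of 4 group members say Yes, 1 of 6 others
answer = [1, 1, 1, 2, 1, 2, 2, 2, 2, 2]

r_survey_coding = pearson(group, answer)

# Recoding to 1 = Yes, 0 = No flips the sign but not the strength
recoded = [2 - a for a in answer]
r_recoded = pearson(group, recoded)
print(round(r_survey_coding, 3), round(r_recoded, 3))
```

Recoding the answers to 1 = Yes and 0 = No before testing would make the sign match the intuitive reading ("positive means more likely").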
Background info
For this question, the relevant dataset questions are:
KRSNS4_A: Have you ever unfriended or blocked someone that you used to be in a relationship with? The options respondents had were:
1. Yes
2. No
KRCELL_D: Have you ever removed someone that you used to be in a relationship with from your phone address book? The options respondents had were:
1. Yes
2. No
Child_gender: Is your child (the user of social media) male or female? The options respondents had were:
1. Male
2. Female
Child_age: How old is your child (user of social media)? The options respondents had were 13/14/15/16/17 years old.
Findings
Here is the graph showing responses of question “Have you ever unfriended or blocked someone that you used to be in a relationship with?” by gender:
It can be seen that males are less likely to have unfriended or blocked an ex on social media. One possible explanation, which the data cannot confirm, is that female teens react more emotionally than male teens after breaking up.
This is the graph of responses to the question “Have you ever removed someone that you used to be in a relationship with from your phone address book?”:
Again it can be seen that males are less likely than females to have removed an ex from their phone address book.
As responses to these questions might differ by age, the graph below shows how teens of each age (13/14/15/16/17) responded.
The graph shows that only a few teens aged 13-14 have ever removed an ex from their phone address book, while older teens seem more likely to have done so. A correlation test between this question and the teen’s age reveals a negative correlation between those variables.
##
## Pearson's product-moment correlation
##
## data: survey_data$Child_age and survey_data$KRCELL_D
## t = -2.0583, df = 341, p-value = 0.04032
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.214160262 -0.004939567
## sample estimates:
## cor
## -0.1107771
The Pearson correlation of those two variables is -0.11 and the p value from the correlation test is 0.04032, which means that the null hypothesis that the correlation is equal to 0 can be rejected. The 95% confidence interval runs from -0.214 to -0.005. Since “Yes” is coded 1 and “No” is coded 2, the negative sign shows that older teenagers are more likely than younger teenagers to have removed an ex from their phone address book, which runs against the hypothesis. It can be argued, though, that teens aged 13-14 are less likely to have been in a romantic relationship in the first place.
For some investigations in this section, only a very small sample was left after filtering out invalid or irrelevant data, which makes the results less convincing (for example question 4 on Japanese and African Americans). A much larger dataset would be needed to leave a larger sample after filtering and to give strong evidence for the conclusions.
The findings are summarised as follows:
On average, female teens have more friends on Facebook and more followers on Instagram than male teens. The number of social media accounts that female teens use is also slightly higher than that of male teens.
Tests show a positive correlation between the time teens spend with their close friends (face-to-face, by phone, or through any other medium) and the number of social media accounts or electronic device types they have. This might indicate that social media and electronic devices help teens to communicate or get along with their friends.
One might think that teens would be less likely to date online when they are aware that other people show a different side of themselves online. The data point the other way: perhaps it is precisely this different side presented on the internet that causes one to fall in love.
It turned out that teens with an African American or Japanese background are somewhat more likely to send flirtatious messages, pictures, or videos to show their interest in somebody else, although the correlations are weak.
Results show that female teens tend to unfriend/block their ex more than male teens. It can also be seen that the older the teen, the more likely he/she is to have removed an ex from the phone address book.
In general the dataset offers many questions on the use, general behaviour and perception of social media by parents and their children aged 13 to 17.
The previous sections mainly took quite “straightforward” approaches, which already led to interesting results. The following section therefore takes a more descriptive, deep-dive approach.
Social media such as Facebook or Instagram provide a platform on which people can engage and participate in social activities in a way that is close to the normal social spectrum, while extending every person’s reach up to the total number of users.
This poses new challenges for each individual, such as defining a close network of peers and differentiating between online and real-world relationships, if there is any difference. One of the most interesting questions is: “How does social media change the way one perceives him/herself, enabling benchmarking against more than 1bn different people, compared to the wider circle of acquaintances 15 years ago?”
According to a Huffington Post article, research has shown that Facebook has a significant impact on one’s feelings and that one’s personality is reflected in certain behavioural patterns on social media. (Article)
Given this, it seems logical to infer that the higher the exposure to social media, the higher the impact on feelings and, in consequence, on self-perception. Defining a variable that measures “exposure” is tricky, especially if it should express not just hours spent on Facebook but also how emotionally engaged people are with their own social media network. In general, the quality or value added of a network relies strongly on its members, which in this case means the number of friends a person has on a given social media platform.
The basic assumption is therefore that the number of friends has a significant impact on the perceived value added of social media for each individual and hence correlates positively with exposure (the higher the perceived value, the more time online and therefore the more exposure). Based on the assumptions stated in the article, this in turn should have an impact on feelings and self-perception.
Hypothesis: “The number of friends on Facebook, or followers on Instagram, has a significant impact on the perceived value added of social media and consequently on how one feels about one’s own life”.
This hypothesis was amended because of findings during the analysis. The set of 5 questions requires, at least in part, a certain capability for self-reflection, emotional intelligence and general maturity:
1. In general, does social media make you feel more connected to information about what’s going on in your friend’s lives?
2. In general, does social media make you feel worse about your own life because of what you see from other friends on social media?
3. In general, does social media make you feel better connected to your friends feelings?
4. In general, does social media make you feel pressure to post content that will be popular and get lots of comments or likes?
5. In general, does social media make you feel pressure to only post content that makes you look good to others?
Questions 4 and 5 in particular support this assumption. Therefore the hypothesis was amended to the following.
Hypothesis: “The age and the number of friends on Facebook, or followers on Instagram, have a significant impact on the perceived value added of social media and consequently on how one feels about one’s own life”.
Given the holistic nature of the question, the following description focuses predominantly on the approach and the rationale behind certain decisions.
Before the initial approach is explained, the following steps need to be carried out.
Besides the standardised global cleaning, which was done centrally, further cleaning is needed for this analysis for the following reason:
The survey counts several different media channels as “social media”. Not all of them are applicable to the hypothesis: WhatsApp, for example, is predominantly a peer-to-peer instant messenger service and does not fulfil the criterion of social exposure within the network. Therefore, in the following, only Facebook and Instagram are treated as social media.
Besides removing invalid responses to the examined questions, all respondents who use neither Facebook nor Instagram have to be removed. A response is therefore considered valid for this analysis only if it contains valid answers to all research questions and the respondent is present in at least one of the two networks.
This cleaning is crucial for the validity of later results, but unfortunately decreases the number of responses from initially 1081 to a final 755. The comparably small number of responses in the survey needs to be kept in mind at all times.
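The two filter conditions can be sketched as follows. This is an illustration in Python with the standard library only, not the authors' R cleaning script: the columns `uses_facebook` and `uses_instagram`, the helper names and the toy rows are all hypothetical, and a missing answer is represented as `None`.

```python
# Hypothetical column names; the survey's real variable names may differ.
QUESTIONS = ["KFSNS1_A", "KFSNS1_B", "KFSNS1_C", "KFSNS1_D", "KFSNS1_E"]


def is_valid(row):
    """Keep respondents on at least one network who answered every question."""
    on_network = bool(row.get("uses_facebook")) or bool(row.get("uses_instagram"))
    answered_all = all(row.get(q) is not None for q in QUESTIONS)
    return on_network and answered_all


def answers(code):
    """Toy helper: the same answer code for all five questions."""
    return {q: code for q in QUESTIONS}


rows = [
    {"id": 1, "uses_facebook": True, "uses_instagram": False, **answers(2)},
    # On neither network: dropped
    {"id": 2, "uses_facebook": False, "uses_instagram": False, **answers(1)},
    # One question unanswered: dropped
    {"id": 3, "uses_facebook": True, "uses_instagram": True, **answers(3), "KFSNS1_D": None},
    {"id": 4, "uses_facebook": False, "uses_instagram": True, **answers(4)},
]

cleaned = [row for row in rows if is_valid(row)]
kept_ids = [row["id"] for row in cleaned]
print(kept_ids)
```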
The cleaned data set is now ready to be clustered. The selection of Facebook and Instagram as representative networks follows the rationale of reciprocity, size, primary way of communication and popularity.
Reciprocity:
Reciprocal: Connections between members of Facebook (friends) are exclusively reciprocal and need to be requested and approved individually. There is therefore no mismatch between who can access one’s own information/posts and whose information/posts one can access.
Non-reciprocal: Instagram and Twitter offer non-reciprocal relationships, which can lead to significant mismatches between the information streams. Extreme examples can be seen with celebrity profiles, which sometimes follow 200 profiles but are followed by several million people.
Size: Facebook and Instagram are both extremely popular and established networks: Facebook as the game changer of the early 2000s, and Instagram as the innovator that moved communication predominantly from text to pictures. Because of its non-reciprocal character and picture-based communication, Instagram is extremely interesting with regard to exposure and subtle messages, which pictures are more capable of carrying than written text.
Based on this logic, every respondent is assigned to a tercile (each covering a third of the respondents, with 3 the highest and 1 the lowest) based on his/her number of friends on Facebook and followers on Instagram.
This is in line with the hypothesis, since respondents with a comparably higher number of friends/followers can be expected to have comparably higher exposure to the network and might therefore hold different views on certain things, like the perception of others or the pressure to post content.
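A minimal sketch of such a tercile assignment is shown below, in Python with the standard library only. The friend counts are invented, the function name is hypothetical, and the handling of values that fall exactly on a cut point (here: assigned to the lower tercile) is an assumption, since the report does not state it.

```python
from statistics import quantiles


def assign_terciles(counts):
    """Map each count to 1 (lowest third), 2 (middle) or 3 (highest third)."""
    # Two cut points split the sorted counts into three equally sized groups
    lower, upper = quantiles(counts, n=3, method="inclusive")
    return [1 if c <= lower else 2 if c <= upper else 3 for c in counts]


# Hypothetical numbers of Facebook friends for nine respondents
friends = [12, 45, 80, 150, 200, 310, 420, 600, 950]
terciles = assign_terciles(friends)
print(terciles)
```

The same function would be applied separately to the Facebook friend counts and the Instagram follower counts, so that a respondent can sit in different terciles on the two networks.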
To check the comparability of the subsets clustered by tercile, a limited analysis based on frequency of usage was conducted. The following assumptions were made:
People with an account in both networks can be considered the most exposed individuals in the set. Therefore the variance of logins within this subset was compared with the variance of logins in the entire data set across the different clusters. Because respondents needed to be present in only one of the two networks (the “either or” criterion), the correlation between belonging to the same cluster in the two networks could only be computed in this subset.
There is a moderate correlation of 0.4795945.
The frequency of usage was measured by question KFR11_H:
How often do you spend time with friends posting on social media sites? 1. Every day, 2. Every few days, 3. Less often, 4. Never
The variance of logins is 0.7885608 in the full set and 0.7797496 in the subset, which leads to the conclusion that, without clustering by tercile, logins vary about equally in the two sets.
As shown below, for Facebook the variance of logins differs more across the clusters in the full set than in the subset, while for Instagram the spread across clusters is of similar size in both.
| Facebook_Sub | Facebook_Total | Instagram_Sub | Instagram_Total |
|---|---|---|---|
| 0.7953976 | 0.8339709 | 0.7601583 | 0.7930846 |
| 0.7867325 | 0.7446632 | 0.7586703 | 0.7631275 |
| 0.7668948 | 0.8250320 | 0.8330001 | 0.7353005 |
As mentioned initially, many conclusions relevant for the final approach evolved during the first approach to analysing the data. After cleaning and clustering the data, a visual analysis for the assumed patterns was conducted with heat maps. The correlations based on the clustering were expected to be much clearer. Still, some weak patterns supporting the initial thought could be observed; for example, respondents from tercile 3 generally feel more connected to information than others. Furthermore, most of the respondents stating that they “feel worse about their lives” considered themselves to be highly connected to information.
How do responses vary among the clusters by tercile?
How do the questions on access to information and perception of one’s own life correlate?
Conclusion:
The heat maps indicate no significant relationship between the perceived value added of the network (feeling connected to information) and either the number of friends on Facebook or the perceived mood (feeling bad about one’s life).
The “Yes a little” option in particular received comparably many responses, which might be due to the “neither nor” character of the answer. Based on these observations, the following conclusions were drawn:
The questions
KFSNS1_B: In general, does social media make you feel worse about your own life because of what you see from other friends on social media?
KFSNS1_D: In general, does social media make you feel pressure to post content that will be popular and get lots of comments or likes?
KFSNS1_E: In general, does social media make you feel pressure to only post content that makes you look good to others?
receive comparably many negative or indifferent responses. Given the initial thoughts, this might be due to the age of the participants and their emotional capabilities mentioned above. To evaluate whether age has an impact on the responses, the dataset was re-clustered by age.
The following results are clustered solely by age, regardless of the tercile for the networks. This was done to look at the impact of age on the results in isolation.
The table shows the correlations for every unique combination of questions per age cohort. The correlation for the same combination of questions varies slightly among the cohorts. With 130, 147, 136, 172 and 170 respondents (ages 13 to 17 in increasing order), the respondents are more or less evenly distributed among the ages. The differences between cohorts are therefore unlikely to be driven mainly by different sample sizes.
| Question pair | Cohort 13 | Cohort 14 | Cohort 15 | Cohort 16 | Cohort 17 | Cohort Total |
|---|---|---|---|---|---|---|
| KFSNS1_A / KFSNS1_B | 0.2794975 | 0.2565006 | 0.1945774 | 0.1091057 | 0.2666693 | 0.2202745 |
| KFSNS1_A / KFSNS1_C | 0.5247100 | 0.5435872 | 0.4920945 | 0.5780586 | 0.4833469 | 0.5256724 |
| KFSNS1_A / KFSNS1_D | 0.2848084 | 0.2753407 | 0.3173093 | 0.2768121 | 0.4298434 | 0.3209615 |
| KFSNS1_A / KFSNS1_E | 0.3025980 | 0.2824830 | 0.3018576 | 0.2467113 | 0.3577742 | 0.2948440 |
| KFSNS1_B / KFSNS1_C | 0.2445661 | 0.2325581 | 0.0791981 | 0.0328683 | 0.2571253 | 0.1731575 |
| KFSNS1_B / KFSNS1_D | 0.5199121 | 0.4502472 | 0.2853243 | 0.3764469 | 0.4975925 | 0.4313815 |
| KFSNS1_B / KFSNS1_E | 0.5576299 | 0.4343617 | 0.4813171 | 0.3555698 | 0.5207501 | 0.4678951 |
| KFSNS1_C / KFSNS1_D | 0.2558098 | 0.2876513 | 0.1682519 | 0.3894248 | 0.3299875 | 0.2963760 |
| KFSNS1_C / KFSNS1_E | 0.2793423 | 0.3345391 | 0.1271861 | 0.2699642 | 0.3071706 | 0.2694057 |
| KFSNS1_D / KFSNS1_E | 0.8115707 | 0.6504855 | 0.6714623 | 0.5872474 | 0.6274787 | 0.6657476 |
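A table of this shape can be produced by grouping the respondents by age and computing the Pearson correlation for each of the 10 unique question pairs. The sketch below uses Python with the standard library only rather than the report's R; the toy responses and the `age` key are invented for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations

QUESTIONS = ["KFSNS1_A", "KFSNS1_B", "KFSNS1_C", "KFSNS1_D", "KFSNS1_E"]


def pearson(xs, ys):
    """Pearson correlation; NaN when one variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlations_by_cohort(rows, questions=QUESTIONS, key="age"):
    """Return {(cohort, question_a, question_b): r} for every unique pair."""
    cohorts = defaultdict(list)
    for row in rows:
        cohorts[row[key]].append(row)
    table = {}
    for cohort, members in sorted(cohorts.items()):
        for a, b in combinations(questions, 2):
            table[(cohort, a, b)] = pearson(
                [m[a] for m in members], [m[b] for m in members]
            )
    return table


# Toy responses (hypothetical): answer codes for questions A..E
toy = [
    (13, [1, 1, 1, 1, 1]), (13, [2, 2, 2, 2, 2]), (13, [3, 3, 3, 3, 3]),
    (14, [1, 1, 1, 3, 3]), (14, [2, 2, 2, 2, 2]), (14, [3, 3, 3, 1, 1]),
]
rows = [dict(zip(QUESTIONS, codes), age=age) for age, codes in toy]
table = correlations_by_cohort(rows)
```

Adding the tercile to the grouping key would give the finer age-by-tercile clustering in the same way, at the cost of much smaller groups.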
The plot confirms the previous observation that correlations for the same pair of questions vary across the cohorts. For the pair KFSNS1_D / KFSNS1_E, it is interesting to observe an outstandingly high correlation in cohort 13 only (0.81 in the table above). There are two competing explanations. Questions D and E aim in almost exactly the same direction but are phrased slightly differently. Either 13-year-olds did not perceive the difference between the questions, or, if E is a control question for D, 13-year-olds answered more honestly and responded the same way to equivalent questions.
Following the same logic as above, the correlations between the different questions are now analysed on a new clustering. The respondents are clustered by their age and their tercile of friends/followers, with responses split by the two networks, Facebook and Instagram.
The number of observations in each cluster is important for the reliability of any conclusions based on this data set.
Occurrences: the number of respondents that have an account in the corresponding network and cluster
tercile_Group: a value of 1 indicates that the cluster’s size lies in the lowest 10% quantile of cluster sizes, so the cluster should not be relied on because of its small sample size
| Cohort | Occurrences_FB | Occurrences_IG | tercile_Group_FB | tercile_Group_IG |
|---|---|---|---|---|
| 13 / 1 | 36 | 22 | 1 | 1 |
| 13 / 2 | 39 | 29 | 2 | 2 |
| 13 / 3 | 19 | 36 | 1 | 2 |
| 14 / 1 | 40 | 30 | 2 | 2 |
| 14 / 2 | 47 | 24 | 2 | 1 |
| 14 / 3 | 40 | 33 | 2 | 2 |
| 15 / 1 | 42 | 38 | 2 | 2 |
| 15 / 2 | 41 | 29 | 2 | 2 |
| 15 / 3 | 40 | 26 | 2 | 2 |
| 16 / 1 | 58 | 58 | 2 | 2 |
| 16 / 2 | 39 | 24 | 2 | 1 |
| 16 / 3 | 59 | 35 | 2 | 2 |
| 17 / 1 | 52 | 44 | 2 | 2 |
| 17 / 2 | 43 | 26 | 2 | 2 |
| 17 / 3 | 66 | 37 | 2 | 2 |
| Total / 1 | 228 | 192 | 2 | 2 |
| Total / 2 | 209 | 132 | 2 | 2 |
| Total / 3 | 224 | 167 | 2 | 2 |
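One plausible way to obtain the small-sample flag is to compare each cluster size with the 10% quantile of the 15 age-by-tercile cluster sizes. The sketch below (Python, standard library only) assumes linear interpolation between order statistics and a "less than or equal" comparison; the report does not state how the flag was actually computed. The counts are copied from the table above, without the Total rows.

```python
from statistics import quantiles

cohorts = [
    "13 / 1", "13 / 2", "13 / 3", "14 / 1", "14 / 2", "14 / 3",
    "15 / 1", "15 / 2", "15 / 3", "16 / 1", "16 / 2", "16 / 3",
    "17 / 1", "17 / 2", "17 / 3",
]
# Cluster sizes from the table above (Facebook and Instagram)
occurrences_fb = [36, 39, 19, 40, 47, 40, 42, 41, 40, 58, 39, 59, 52, 43, 66]
occurrences_ig = [22, 29, 36, 30, 24, 33, 38, 29, 26, 58, 24, 35, 44, 26, 37]


def small_clusters(labels, counts):
    """Return the labels whose count is at or below the 10% quantile."""
    # First decile cut point, interpolated between neighbouring sorted values
    cutoff = quantiles(counts, n=10, method="inclusive")[0]
    return [label for label, count in zip(labels, counts) if count <= cutoff]


small_fb = small_clusters(cohorts, occurrences_fb)
small_ig = small_clusters(cohorts, occurrences_ig)
print(small_fb, small_ig)
```

Under these assumptions the flagged clusters are 13 / 1 and 13 / 3 for Facebook and 13 / 1, 14 / 2 and 16 / 2 for Instagram, the same clusters that carry the value 1 in the table.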
Observations:
Highest correlation: The highest correlations can be observed for the question pairs A/C, B/E and D/E, while B/C stands out for its variance. Starting with the already mentioned pair D/E, two closely related questions, the following can be concluded from the plot. Comparing the pairs in which the same first question is matched with D and with E (A/D against A/E, C/D against C/E), those questions correlate more or less equally with D and with E, the exception being (B/D, B/E). Together with the high correlation of (D/E) itself, it appears valid to conclude that the respondents perceive these two questions similarly.
The same holds for the pair (A/C), where both questions ask about the quality of the connection to friends and to their feelings respectively. This leaves (B/E), probably the most interesting combination: the correlation between feeling “worse about your own life because of what you see from other friends on social media” and feeling the pressure to “only post content that makes you look good to others”. This observation is strongly in line with the research discussed in the Huffington Post article presented initially.
The last interesting observation, with regard to the variance of the values, concerns the pair (B/C). The correlations range from positive to negative, which might be explained as follows: some people might perceive it as positive to be connected to friends’ feelings, while others might get jealous and therefore experience it as a negative feeling.
| Question pair | 13 / 1 | 13 / 2 | 13 / 3 | 14 / 1 | 14 / 2 | 14 / 3 | 15 / 1 | 15 / 2 | 15 / 3 | 16 / 1 | 16 / 2 | 16 / 3 | 17 / 1 | 17 / 2 | 17 / 3 | Total / 1 | Total / 2 | Total / 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KFSNS1_A / KFSNS1_B | 0.0516398 | 0.3846784 | 0.4717945 | 0.1479999 | 0.2966666 | 0.1917819 | 0.2721525 | 0.0000000 | 0.2102082 | 0.1877500 | -0.0127336 | 0.1420755 | 0.4192235 | 0.2639285 | 0.2071129 | 0.2187846 | 0.2096518 | 0.2192974 |
| KFSNS1_A / KFSNS1_C | 0.6348111 | 0.3292377 | 0.6972201 | 0.6594522 | 0.3812483 | 0.6230366 | 0.5958645 | 0.3577667 | 0.5366378 | 0.5439614 | 0.6547480 | 0.5000234 | 0.6119803 | 0.3023899 | 0.4384855 | 0.5913736 | 0.4252869 | 0.5300244 |
| KFSNS1_A / KFSNS1_D | 0.0730297 | 0.3024592 | 0.3719387 | 0.2812593 | 0.1516595 | 0.1613196 | 0.6466862 | 0.0843100 | 0.0575739 | 0.3699212 | 0.3024685 | 0.1932108 | 0.5869392 | 0.4471260 | 0.3615233 | 0.4106855 | 0.2679292 | 0.2482770 |
| KFSNS1_A / KFSNS1_E | 0.0905204 | 0.2911625 | 0.4458151 | 0.3262676 | 0.2762352 | 0.1673923 | 0.4638176 | 0.2338437 | 0.0796735 | 0.3200560 | 0.3435784 | 0.1307670 | 0.4721025 | 0.2421106 | 0.3520961 | 0.3520948 | 0.2742922 | 0.2301222 |
| KFSNS1_B / KFSNS1_C | 0.0546358 | 0.1347215 | 0.4414921 | 0.0845305 | 0.4817815 | 0.1517081 | 0.2530332 | -0.2778478 | 0.1896936 | 0.1169195 | 0.0168685 | -0.0156214 | 0.4693199 | 0.2629336 | 0.1009589 | 0.2211450 | 0.1748126 | 0.1560581 |
| KFSNS1_B / KFSNS1_D | 0.2828427 | 0.5587090 | 0.4440790 | 0.2356196 | 0.5112117 | 0.3120814 | 0.2142467 | 0.2658240 | 0.3898544 | 0.3188903 | 0.2549993 | 0.4688595 | 0.6122175 | 0.4732173 | 0.4517372 | 0.3553603 | 0.4388870 | 0.4230241 |
| KFSNS1_B / KFSNS1_E | 0.3505839 | 0.5981939 | 0.4866933 | 0.2563280 | 0.5398889 | 0.3445345 | 0.5082681 | 0.4346152 | 0.4951565 | 0.2717626 | -0.0802082 | 0.5551718 | 0.5489222 | 0.3851070 | 0.6059660 | 0.3746980 | 0.4226368 | 0.5151954 |
| KFSNS1_C / KFSNS1_D | 0.0772667 | 0.2454460 | 0.3807225 | 0.0746289 | 0.3595861 | 0.3970945 | 0.4561245 | -0.0370788 | -0.0851180 | 0.4388917 | 0.4183272 | 0.3693921 | 0.3951147 | 0.3282660 | 0.3665716 | 0.3131372 | 0.3041539 | 0.3039020 |
| KFSNS1_C / KFSNS1_E | 0.1694432 | 0.1589863 | 0.3543374 | 0.2955529 | 0.4297937 | 0.3741709 | 0.2684246 | -0.0285673 | 0.0107082 | 0.3592339 | 0.2929676 | 0.2741161 | 0.4907480 | 0.3006247 | 0.2270299 | 0.3326682 | 0.2453354 | 0.2591845 |
| KFSNS1_D / KFSNS1_E | 0.7246316 | 0.8610940 | 0.8588140 | 0.5598016 | 0.8212396 | 0.4900980 | 0.6179144 | 0.4330970 | 0.7415153 | 0.5920741 | 0.3397062 | 0.7000992 | 0.7055671 | 0.6693286 | 0.5420921 | 0.6229606 | 0.6708097 | 0.6533829 |
In general the Instagram responses follow the same patterns as observed with Facebook. For the previously identified pair (B/E), the two networks are very close: the mean correlations are 0.43 and 0.42 and the medians 0.49 and 0.49 (Instagram first), with Instagram reaching a slightly higher maximum (0.70 against 0.61).
The situation for (B/C) is similar, except that the Instagram correlations turn clearly negative in some clusters, down to -0.36 for 15-year-olds with many followers (15 / 3) and -0.38 for cluster 16 / 2.
| Question pair | 13 / 1 | 13 / 2 | 13 / 3 | 14 / 1 | 14 / 2 | 14 / 3 | 15 / 1 | 15 / 2 | 15 / 3 | 16 / 1 | 16 / 2 | 16 / 3 | 17 / 1 | 17 / 2 | 17 / 3 | Total / 1 | Total / 2 | Total / 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KFSNS1_A / KFSNS1_B | 0.0436297 | 0.0417581 | 0.3359233 | 0.1632913 | 0.4101742 | 0.1695592 | 0.2792294 | -0.1238308 | 0.0991363 | 0.1397770 | -0.0303962 | -0.0245855 | 0.2803764 | 0.3002550 | 0.2704166 | 0.1943251 | 0.1401146 | 0.2022345 |
| KFSNS1_A / KFSNS1_C | 0.4339499 | 0.4546438 | 0.4625492 | 0.4840292 | 0.6506496 | 0.6760726 | 0.5519507 | 0.4344339 | 0.3193931 | 0.5383012 | 0.5662867 | 0.4968170 | 0.6729902 | 0.4535374 | 0.5276848 | 0.5500473 | 0.5158331 | 0.5077804 |
| KFSNS1_A / KFSNS1_D | 0.0361288 | 0.1460639 | 0.2889428 | 0.3737521 | 0.2921744 | 0.2208773 | 0.3632369 | 0.2229968 | 0.0606827 | 0.2747647 | 0.3176045 | 0.1790780 | 0.4649438 | 0.2655643 | 0.4165311 | 0.3277107 | 0.2484446 | 0.2698159 |
| KFSNS1_A / KFSNS1_E | 0.0408575 | 0.1220238 | 0.3864364 | 0.2533514 | 0.2903800 | 0.2997621 | 0.2792805 | 0.1322345 | 0.1281495 | 0.2532932 | 0.2071700 | 0.1844705 | 0.4859575 | 0.4426072 | 0.2986332 | 0.2908485 | 0.2402167 | 0.2810403 |
| KFSNS1_B / KFSNS1_C | 0.3028489 | 0.0601385 | 0.1589191 | -0.1071400 | 0.1636634 | 0.0632226 | 0.2435441 | -0.3216153 | -0.3577114 | 0.1969940 | -0.3757346 | -0.0798723 | 0.3096067 | 0.4917931 | 0.2652439 | 0.1843109 | 0.0158371 | 0.0843420 |
| KFSNS1_B / KFSNS1_D | 0.2587746 | 0.4130694 | 0.4261093 | 0.4286926 | 0.0944911 | 0.4219567 | 0.4872515 | 0.4411096 | -0.1406208 | 0.3730673 | 0.3977058 | 0.3114168 | 0.6355361 | 0.4799409 | 0.4721240 | 0.4460334 | 0.3333948 | 0.3657947 |
| KFSNS1_B / KFSNS1_E | 0.2006700 | 0.6982875 | 0.4931470 | -0.0088546 | 0.3286879 | 0.5301507 | 0.5489356 | 0.4480552 | 0.1247741 | 0.4174060 | 0.5526203 | 0.2497625 | 0.5525851 | 0.5999262 | 0.6808509 | 0.3755384 | 0.5012750 | 0.4657734 |
| KFSNS1_C / KFSNS1_D | 0.1055927 | 0.1731589 | 0.1032796 | 0.1276814 | 0.0000000 | 0.2353065 | 0.0992174 | -0.2139714 | 0.0432158 | 0.4110687 | 0.2886751 | 0.3600124 | 0.3553139 | 0.4879500 | 0.4404410 | 0.2828148 | 0.1480700 | 0.2780105 |
| KFSNS1_C / KFSNS1_E | 0.3070621 | 0.0747407 | 0.2888356 | 0.2648561 | 0.2053960 | 0.2941331 | -0.0029764 | -0.2013684 | 0.1477591 | 0.2888636 | 0.0173032 | 0.3026761 | 0.5106006 | 0.4879500 | 0.3017766 | 0.2806577 | 0.1455873 | 0.3013274 |
| KFSNS1_D / KFSNS1_E | 0.7754626 | 0.8354141 | 0.7159396 | 0.6688146 | 0.4743416 | 0.5872483 | 0.7315274 | 0.6519678 | 0.6721083 | 0.5956790 | 0.6993010 | 0.7937821 | 0.5880541 | 0.7142857 | 0.6892864 | 0.6476180 | 0.6642034 | 0.7008922 |
For each question pair below, the first summary covers the Instagram clusters and the second the Facebook clusters. Both are computed over the 15 age / tercile clusters, without the Total columns.
## [1] "KFSNS1_A / \n KFSNS1_B"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.12380 0.04269 0.16330 0.15700 0.27980 0.41020
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.01273 0.14500 0.20710 0.21560 0.28440 0.47180
## [1] "KFSNS1_A / \n KFSNS1_C"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3194 0.4541 0.4968 0.5149 0.5591 0.6761
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3024 0.4099 0.5440 0.5245 0.6289 0.6972
## [1] "KFSNS1_A / \n KFSNS1_D"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.03613 0.20000 0.27480 0.26160 0.34040 0.46490
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.05757 0.15650 0.30250 0.29280 0.37090 0.64670
## [1] "KFSNS1_A / \n KFSNS1_E"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04086 0.15840 0.25340 0.25360 0.29920 0.48600
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.07967 0.20060 0.29120 0.28240 0.34780 0.47210
## [1] "KFSNS1_B / \n KFSNS1_C"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.37570 -0.09351 0.15890 0.06759 0.25440 0.49180
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.27780 0.06958 0.13470 0.16430 0.25800 0.48180
## [1] "KFSNS1_B / \n KFSNS1_D"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.1406 0.3422 0.4220 0.3667 0.4566 0.6355
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.2142 0.2743 0.3899 0.3863 0.4710 0.6122
## [1] "KFSNS1_B / \n KFSNS1_E"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.008855 0.289200 0.493100 0.427800 0.552600 0.698300
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.08021 0.34760 0.48670 0.42010 0.54440 0.60600
## [1] "KFSNS1_C / \n KFSNS1_D"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.2140 0.1012 0.1732 0.2011 0.3577 0.4880
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.08512 0.16140 0.36660 0.27900 0.39610 0.45610
## [1] "KFSNS1_C / \n KFSNS1_E"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.2014 0.1112 0.2888 0.2192 0.3022 0.5106
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.02857 0.19820 0.29300 0.26520 0.35680 0.49070
## [1] "KFSNS1_D / \n KFSNS1_E"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4743 0.6238 0.6893 0.6795 0.7237 0.8354
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3397 0.5509 0.6693 0.6438 0.7331 0.8611
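Each summary above condenses the 15 cluster correlations of one table row into minimum, quartiles, median, mean and maximum. As a check, the sketch below recomputes the Facebook summary for KFSNS1_B / KFSNS1_E from the values in the Facebook table (Total columns left out). It uses Python's standard library, not the report's R, and assumes quartiles are interpolated linearly between order statistics.

```python
from statistics import mean, quantiles

# Facebook row "KFSNS1_B / KFSNS1_E", clusters 13 / 1 to 17 / 3
fb_b_e = [
    0.3505839, 0.5981939, 0.4866933, 0.2563280, 0.5398889,
    0.3445345, 0.5082681, 0.4346152, 0.4951565, 0.2717626,
    -0.0802082, 0.5551718, 0.5489222, 0.3851070, 0.6059660,
]


def summarise(values):
    """Six-number summary: min, Q1, median, mean, Q3, max."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": mean(values),
        "q3": q3,
        "max": max(values),
    }


summary = summarise(fb_b_e)
for name, value in summary.items():
    print(f"{name:>6}: {value:.4f}")
```

The result agrees with the second KFSNS1_B / KFSNS1_E summary above to the printed precision (median about 0.487, mean about 0.420).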
Given the initial size of the entire set and the necessary cleaning, the cluster ended up being relatively small to make statistically significant observations. Furthermore the selected independent variables number of friends/followers and age must be evaluated under the following assumptions. Looking at dependent variables that are deeply connected to the psyche of human beings, given the age and the along going presumable changes in life of the respondents, using only these two variables is a strong reduction of complexity to explain highly connected and complex occurrences that are influenced by numerous factors.
Second, given the time constraints and scope of the project, a deeper analysis was not possible. The overall aim of this analysis was to show how clustering by relevant parameters can lead to different results and insights, which are briefly recapped here.
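The cluster-size limitation can be made concrete with a minimal sketch in R. It computes the smallest absolute Pearson correlation that a two-sided t-test would call significant for a given number of respondents. The helper `critical_r` and the cluster sizes are hypothetical and chosen for illustration only; they are not taken from the report's R script, and the t-test is only an approximation for ordinal survey responses.

```r
# Smallest absolute Pearson correlation that is significant at level alpha
# (two-sided t-test) for a cluster of n respondents.
# Hypothetical helper, not part of the original analysis.
critical_r <- function(n, alpha = 0.05) {
  stopifnot(all(n > 2))
  t_crit <- qt(1 - alpha / 2, df = n - 2)  # critical t value
  t_crit / sqrt(t_crit^2 + n - 2)          # invert t = r * sqrt(n - 2) / sqrt(1 - r^2)
}

# Assumed cluster sizes, for illustration only.
cluster_sizes <- c(10, 20, 50, 200)
data.frame(n = cluster_sizes,
           critical_r = round(critical_r(cluster_sizes), 3))
```

The threshold falls as the cluster grows, so a moderate correlation may be indistinguishable from noise in a small cluster and clearly significant in a large one.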
Starting from the initial heat maps, which had only limited explanatory power, clustering by tercile of friends and by age led to more detailed and differentiated results, clearly showing that the correlations between the responses also depend on the cluster criteria. Furthermore, as discussed above, there is a correlation between feeling “worse about your own life because of what you see from other friends on social media” and feeling the pressure to “only post content that makes you look good to others”. This observation matches what the author of the article initially describes about her own experience. Given the limitations of the dataset mentioned above and the complexity of the human mind, the hypothesis must be rejected from a statistical point of view. Nevertheless, the observations also showed that exposure, expressed as number of friends, has an impact on the perception of social media.
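The tercile-based clustering described above can be sketched as follows. This is a minimal illustration on simulated data, not the code from the report's R script: the data frame `teens` and its columns `friends`, `feel_worse` and `pressure` are hypothetical stand-ins for the survey variables, and only the friends dimension is shown (age could be added as a second grouping factor in the same way).

```r
# Simulated illustration only: column names and values are hypothetical,
# not taken from the survey dataset.
set.seed(1)
n <- 300
teens <- data.frame(
  friends    = rpois(n, lambda = 150),          # number of friends/followers
  feel_worse = sample(1:4, n, replace = TRUE),  # ordinal response
  pressure   = sample(1:4, n, replace = TRUE)   # ordinal response
)

# Assign each respondent to a tercile of the friend count.
breaks <- quantile(teens$friends, probs = c(0, 1/3, 2/3, 1))
teens$tercile <- cut(teens$friends, breaks = breaks,
                     labels = c("low", "mid", "high"),
                     include.lowest = TRUE)

# Spearman correlation between the two responses within each tercile.
cor_by_tercile <- sapply(split(teens, teens$tercile), function(d) {
  cor(d$feel_worse, d$pressure, method = "spearman")
})
cor_by_tercile
```

Spearman's rank correlation is used here because the simulated responses are ordinal; whether the original analysis used Pearson or Spearman is not stated in this section.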
The investigation of the dataset revealed several interesting relationships. First, parents’ behaviour seems to affect teens’ behaviour. In particular, parents monitoring their teen’s location is positively associated with the teen also monitoring a significant other’s location or accessing his/her phone. Second, the sample data showed differences between male and female teens in the extent of their social media usage. Female teens have more friends on Facebook and more followers on Instagram than male teens. Female teens also use more types of social media, and they are more likely than male teens to unfriend or block an ex. The data also suggests that social media and electronic devices help teens to communicate or get along with their friends. Last, clustering of responses by age and number of friends/followers shows how the correlations between the responses to selected questions vary among the different clusters, leading to the observation that the perceived pressure to post socially accepted content correlates with the perceived life situation/quality. Because this question requires a non-negligible capacity for self-reflection, it can be inferred that the precision of a response correlates with the age of the respondent. The formulated hypothesis, however, had to be rejected from a statistical point of view, because the results could not be assumed to be significant. Subject to this restriction, the data were nonetheless consistent with the initial suggestion that social media can amplify negative feelings.
Nevertheless, the dataset was relatively small, especially after data cleaning and given the number of variables covered. A larger dataset would have significantly improved the reliability of the findings and allowed more conclusive results to be drawn. If time permitted, further studies could examine whether the trends are consistent over the years (i.e. for different cohorts of 13-17 year old teens). Other analyses could investigate whether such trends persist beyond the teenage years into adulthood. For example, longitudinal studies could track whether the surveyed teens’ responses to the same questions, such as feeling pressure to appear good, persist when they are in their 30s. Furthermore, it could be analysed whether parents’ influence has a ‘permanent’ impact on the teen, or whether the impact is transient and limited to the teen years.